modelop / hadrian

Implementations of the Portable Format for Analytics (PFA)
Apache License 2.0
130 stars 49 forks source link

PFA chain not behaving as expected #25

Closed bmwilly closed 7 years ago

bmwilly commented 7 years ago

I'm in a situation where I have two YAML strings that define valid PFA models and can be evaluated in sequence, but when trying to chain them together, I get unexpected renaming of records which results in a PFASemanticException.

I've constructed a minimal reproducible example:

import yaml
from titus.genpy import PFAEngine
from titus.producer import chain

doc1 = '''
input:
  type: map
  values: double
output:
  type: int

cells:
  variables:
    type:
      type: array
      items:
        type: record
        name: Variable
        fields:
          - name: vname
            type: string
          - name: mean
            type: double

fcns:
  adjust:
    params:
      - v: Variable
      - vars:
          type: map
          values: double
    ret: double
    do:
      - 1

action:
  - let:
      vars:
        a.map:
          - cell: variables
          - fcn: u.adjust
            fill: {vars: input}

  - 1
'''

doc2 = '''
input:
  type: int
output:
  type: int

action:
  - 2
'''

m1 = yaml.load(doc1)
m2 = yaml.load(doc2)
m1['cells']['variables']['init'] = [{'vname': 'var1', 'mean': 0.5},
                                    {'vname': 'var2', 'mean': -0.25}]
y1 = yaml.dump(m1)
y2 = yaml.dump(m2)

pfa1, = PFAEngine.fromYaml(y1)
v = {'var1': 1, 'var2': 2}
x = pfa1.action(v)
pfa2, = PFAEngine.fromYaml(y2)
pfa2.action(x)

returns 2 as expected but

y3 = chain.json([y1, y2], tryYaml=True, verbose=True)

results in

Thu Nov 10 17:01:04 2016 Converting all inputs to ASTs
Thu Nov 10 17:01:04 2016     step 1
Thu Nov 10 17:01:04 2016     step 2
Thu Nov 10 17:01:04 2016 Changing type names to avoid collisions
Thu Nov 10 17:01:04 2016 Verifying that input/output schemas match along the chain
Thu Nov 10 17:01:04 2016 Adding [name, instance, metadata, actionsStarted, actionsFinished, version] as model parameters
Thu Nov 10 17:01:04 2016 Converting scoring engine algorithm
Thu Nov 10 17:01:04 2016     step 1: Engine_112
Thu Nov 10 17:01:04 2016     step 2: Engine_113
Thu Nov 10 17:01:04 2016 Create types for model parameters
Thu Nov 10 17:01:04 2016     step 1:
Thu Nov 10 17:01:04 2016         cell variables
Thu Nov 10 17:01:04 2016 Resolving all types
Thu Nov 10 17:01:04 2016 Converting the model parameters themselves
Thu Nov 10 17:01:04 2016     step 1:
Thu Nov 10 17:01:04 2016         cell variables
Thu Nov 10 17:01:04 2016 Verifying PFA validity

PFASemanticExceptionTraceback (most recent call last)
 in ()
----> 1 y3 = chain.json([y1, y2], tryYaml=True, verbose=True)

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/producer/chain.pyc in json(pfas, lineNumbers, check, name, randseed, doc, version, metadata, options, tryYaml, verbose)
    131     :return: a PFA document representing the chained workflow
    132     """
--> 133     return ast(pfas, check, name, randseed, doc, version, metadata, options, tryYaml, verbose).toJson(lineNumbers)
    134 
    135 def engine(pfas, name=None, randseed=None, doc=None, version=None, metadata={}, options={}, tryYaml=False, verbose=False, sharedState=None, multiplicity=1, style="pure", debug=False):

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/producer/chain.pyc in ast(pfas, check, name, randseed, doc, version, metadata, options, tryYaml, verbose)
    658     if check:
    659         if verbose: sys.stderr.write(time.asctime() + " Verifying PFA validity\n")
--> 660         PFAEngine.fromAst(out)
    661 
    662     if verbose: sys.stderr.write(time.asctime() + " Done\n")

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/genpy.pyc in fromAst(engineConfig, options, version, sharedState, multiplicity, style, debug)
   1445         pfaVersion = titus.signature.PFAVersion.fromString(version)
   1446 
-> 1447         context, code = engineConfig.walk(GeneratePython.makeTask(style), titus.pfaast.SymbolTable.blank(), functionTable, engineOptions, pfaVersion)
   1448         if debug:
   1449             print code

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/pfaast.pyc in walk(self, task, symbolTable, functionTable, engineOptions, pfaVersion)
    919             ufname = "u." + fname
    920             scope = topWrapper.newScope(True, False)
--> 921             fcnContext, fcnResult = fcnDef.walk(task, scope, withUserFunctions, engineOptions, pfaVersion)
    922             userFcnContexts.append((ufname, fcnContext))
    923 

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/pfaast.pyc in walk(self, task, symbolTable, functionTable, engineOptions, version)
   1513             scope.put(name, avroType)
   1514 
-> 1515         results = [x.walk(task, scope, functionTable, engineOptions, version) for x in self.body]
   1516 
   1517         inferredRetType = results[-1][0].retType

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/pfaast.pyc in walk(self, task, symbolTable, functionTable, engineOptions, version)
   3037 
   3038             scope = symbolTable.newScope(True, True)
-> 3039             exprContext, exprResult = expr.walk(task, scope, functionTable, engineOptions, version)
   3040             calls = calls.union(exprContext.calls)
   3041 

/Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages/titus/pfaast.pyc in walk(self, task, symbolTable, functionTable, engineOptions, version)
   2007 
   2008         else:
-> 2009             raise PFASemanticException("parameters of function \"{0}\" do not accept [{1}]".format(self.name, ",".join(map(ts, argTypes))), self.pos)
   2010 
   2011         return context, task(context, engineOptions)

PFASemanticException: PFA semantic error at YAML lines 3-7 (PFA field "action -> 0 -> let -> vars"): parameters of function "a.map" do not accept [
    array(record(Step1_Engine_112_Variable,
                 vname: string,
                 mean: double))
,
    fcn(arg1: record(Variable,
           vname: string,
           mean: double) -> double)
]

In case it's helpful, I'm on macOS Sierra (10.12.1) and

➜  ~ pip show titus
Name: titus
Version: 0.8.4
Summary: Python implementation of Portable Format for Analytics (PFA): producer, converter, and consumer.
Home-page: UNKNOWN
Author: Open Data Group
Author-email: support@opendatagroup.com
License: UNKNOWN
Location: /Users/brandon/.pyenv/versions/2.7.11/lib/python2.7/site-packages
Requires: avro, ply
mahowald-odg commented 7 years ago

I have pushed a fix for this issue. I'll merge this fix into the daily and master branches once we're done testing the fix.

bmwilly commented 7 years ago

@mahowald-odg excellent, thanks!

bmwilly commented 7 years ago

The referenced commit fixes the issue. Thanks!

rgrinberg commented 7 years ago

@mahowald-odg did this make into any releases yet?

mahowald-odg commented 7 years ago

@rgrinberg No, but the fix is in the daily branch. It will make it into the next release.