modelop / hadrian

Implementations of the Portable Format for Analytics (PFA)
Apache License 2.0

[Titus] Option to make chain deterministic #30

Open rgrinberg opened 7 years ago

rgrinberg commented 7 years ago

Here's a little example:

from titus.producer import chain
import json

j1 = """
{"input": "int",
 "output": {"type": "record",
            "name": "Output",
            "fields": [{"name": "one", "type": "int"},
                       {"name": "two", "type": "double"},
                       {"name": "three", "type": "string"}]},
 "action":
   {"type": "Output",
    "new": {"one": "input", "two": "input", "three": {"s.int": "input"}}}}
"""

j2 = """
{"input": {"type": "record",
           "name": "Output",
           "fields": [{"name": "one", "type": "int"},
                      {"name": "two", "type": "double"},
                      {"name": "three", "type": "string"}]},
 "output": "string",
 "method": "emit",
 "action": [
   {"emit": "input.three"},
   {"emit": "input.three"},
   {"emit": "input.three"}]}
"""

p1 = json.loads(j1)
p2 = json.loads(j2)

# Same inputs and same randseed for both calls
x = chain.json([p1, p2], randseed=1)
y = chain.json([p1, p2], randseed=1)

At the end of this, x is not equal to y, even though both calls get the same inputs and the same randseed. This is quite annoying for testing purposes. Would it be possible to make this function pure, or at least add an option to make its output deterministic?
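In the meantime, a possible stopgap for tests is to normalize both results before comparing them. This is only a rough sketch and rests on two assumptions I have not verified: that x and y are JSON strings, and that they differ only in auto-generated identifiers ending in a numeric suffix. The `normalize` helper and the `GENERATED` pattern are hypothetical names, not part of Titus, and the pattern would need adjusting after diffing the real outputs.

```python
import json
import re

# Hypothetical pattern for auto-generated identifiers such as "Engine_12".
# Adjust it after inspecting how x and y actually differ.
GENERATED = re.compile(r"\b([A-Za-z]+)_(\d+)\b")


def normalize(pfa_json):
    """Renumber generated identifiers in order of first appearance."""
    seen = {}

    def rename(match):
        name = match.group(0)
        if name not in seen:
            # Keep the prefix, replace the counter with a stable one.
            seen[name] = "{}_{}".format(match.group(1), len(seen) + 1)
        return seen[name]

    # Re-serialize with sorted keys so key order cannot cause a mismatch.
    text = json.dumps(json.loads(pfa_json), sort_keys=True)
    return GENERATED.sub(rename, text)
```

With this, a test could use `assert normalize(x) == normalize(y)` instead of `assert x == y`. It would not help if the generated names appear as object keys whose sort order changes with the counter, so a real deterministic option in chain would still be preferable.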