yahoo / graphkit

A lightweight Python module for creating and running ordered graphs of computations.
Apache License 2.0

BUG: overriding intermediate data when no outputs asked #25

Open ankostis opened 4 years ago

ankostis commented 4 years ago

In the following diagram, all data are given, yet the value of asked differs depending on whether we explicitly request it in the outputs:

Code to reproduce it:

from operator import add

from graphkit import compose, operation


def test_pruning_not_overrides_given_intermediate():
    # Test #25: not overriding intermediate data when an output is not asked
    graph = compose(name="graph")(
        operation(name="unjustly run", needs=["a"], provides=["overriden"])(lambda a: a),
        operation(name="op", needs=["overriden", "c"], provides=["asked"])(add),
    )
    graph.net.plot('t.png')
    assert graph({'a': 5, 'overriden': 1, "c": 2}, ['asked']) == {'asked': 3}  # that's ok
    assert graph({'a': 5, 'overriden': 1, "c": 2}) == {'a': 5, 'overriden': 1, "c": 2, 'asked': 3}  # FAILs

Root cause:

Note that the pruning code in v1.2.4 is buggy (#24), so it cannot be used as is.
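
To make the expected behaviour concrete, here is a minimal sketch in plain Python, independent of graphkit. The helper run_skipping_given and the tuple layout (name, needs, provides, fn) are hypothetical, introduced only for illustration, and the sketch assumes single-output operations listed in execution order. It is not how graphkit prunes; it only shows the rule that an operation is skipped when the caller has already supplied its output.

```python
from operator import add


def run_skipping_given(ops, inputs):
    """Run ops in order, skipping any op whose outputs were all given.

    Hypothetical helper: assumes single-output ops in execution order.
    """
    solution = dict(inputs)
    for name, needs, provides, fn in ops:
        if all(p in inputs for p in provides):
            continue  # output supplied by the caller: do not recompute it
        solution[provides[0]] = fn(*(solution[n] for n in needs))
    return solution


ops = [
    ("unjustly run", ["a"], ["overriden"], lambda a: a),
    ("op", ["overriden", "c"], ["asked"], add),
]
result = run_skipping_given(ops, {"a": 5, "overriden": 1, "c": 2})
print(result)
```

Under this rule "unjustly run" never executes, so overriden keeps the given value 1 and asked comes out as 1 + 2 = 3, whether or not outputs are requested.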

ankostis commented 4 years ago

Another, more complex example of over-pruning is a multi-output operation that MUST NOT run, so that it does not override a given intermediate input, yet MUST run to provide its other outputs.

In the diagram below, the operation "must run" is needed for e but must not override the given value of overriden.

Code to reproduce:

def test_pruning_multiouts_not_override_intermediates():
    # Test #25: v.1.2.4 overrides intermediate data when a previous operation
    # must run for its other outputs (outputs asked or not)
    netop = compose(name="netop")(
        operation(name="must run", needs=["a"], provides=["overriden", "e"])
        (lambda x: (x, 2 * x)),
        operation(name="op1", needs=["overriden", "c"], provides=["d"])(add),
        operation(name="op2", needs=["d", "e"], provides=["asked"])(lambda x, y: x * y),
    )

    # Expected: d = overriden + c = 1 + 2 = 3, e = 2 * a = 10, asked = d * e = 30.
    # FAILs
    # - on v1.2.4 with KeyError: 'e',
    # - on #18(unsatisfied) + #23(ordered-sets) with empty result.
    assert netop({"a": 5, "overriden": 1, "c": 2}, ["asked"]) == {"asked": 30}
    # FAILs
    # - on v1.2.4 with (overriden, asked) = (5, 70) instead of (1, 30)
    # - on #18(unsatisfied) + #23(ordered-sets) like v1.2.4.
    assert (
        netop({"a": 5, "overriden": 1, "c": 2})
        ==
        {"a": 5, "overriden": 1, "c": 2, "d": 3, "e": 10, "asked": 30})

ankostis commented 4 years ago

Or this even simpler one:

def test_pruning_multiouts_not_override_intermediates1():
    # Test #25: v.1.2.4 overrides intermediate data when a previous operation
    # must run for its other outputs (outputs asked or not)
    netop = compose(name="netop")(
        operation(name="must run", needs=["a"], provides=["overriden", "calced"])
        (lambda x: (x, 2 * x)),
        operation(name="add", needs=["overriden", "calced"], provides=["asked"])(add),
    )
    netop.net.plot('t.png')
    # Expected: calced = 2 * a = 10, asked = overriden + calced = 1 + 10 = 11.
    # FAILs
    # - on v1.2.4 with KeyError: 'e',
    # - on #18(unsatisfied) + #23(ordered-sets) with empty result.
    assert netop({"a": 5, "overriden": 1, "c": 2}, ["asked"]) == {"asked": 11}
    # FAILs
    # - on v1.2.4 with (overriden, asked) = (5, 15) instead of (1, 11)
    # - on #18(unsatisfied) + #23(ordered-sets) like v1.2.4.
    assert (
        netop({"a": 5, "overriden": 1})
        ==
        {"a": 5, "overriden": 1, "calced": 10, "asked": 11})
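
One possible way to state the desired semantics for the multi-output case is sketched below in plain Python, without graphkit. The function run_keeping_given and the tuple layout are hypothetical names chosen for illustration. The sketch assumes operations are already in execution order and does not prune operations that are irrelevant to the requested outputs. An operation runs if at least one of its outputs was not given, and any output that collides with a given input is discarded.

```python
from operator import add


def run_keeping_given(ops, inputs, outputs=None):
    """Run ops in order; never overwrite a value the caller supplied."""
    solution = dict(inputs)
    for name, needs, provides, fn in ops:
        if all(p in inputs for p in provides):
            continue  # nothing new to compute
        results = fn(*(solution[n] for n in needs))
        if len(provides) == 1:
            results = (results,)
        for key, value in zip(provides, results):
            if key not in inputs:  # discard outputs that were given
                solution[key] = value
    if outputs is None:
        return solution
    return {key: solution[key] for key in outputs}


ops = [
    ("must run", ["a"], ["overriden", "calced"], lambda x: (x, 2 * x)),
    ("add", ["overriden", "calced"], ["asked"], add),
]
full = run_keeping_given(ops, {"a": 5, "overriden": 1})
only_asked = run_keeping_given(ops, {"a": 5, "overriden": 1}, ["asked"])
print(full, only_asked)
```

Here "must run" executes for calced = 10, its overriden result (5) is dropped in favour of the given 1, and asked evaluates to 1 + 10 = 11 in both calls.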