Open DominikZuercherQC opened 2 weeks ago
I can confirm the expectation. I need to debug to find the reason.
It takes a bit longer to fix because of implementing some groundwork for searching similar caching issues. The problem seems already found in the order of Table.assumed_dependencies:
[<Table '_temp_res2_m3qtk77lrs6vpirivro5_0001' (stage)>, <Table '_temp_res2_m3qtk77lrs6vpirivro5_0000' (stage)>]
vs.
[<Table '_temp_res2_m3qtk77lrs6vpirivro5_0000' (stage)>, <Table '_temp_res2_m3qtk77lrs6vpirivro5_0001' (stage)>]
The problem does not always surface which means that non-run-stable hashing is probably to blame.
@windiana42 My expectation is that even when using imperative materialization in a task it should stay cache valid if its inputs did not change. In complex cases this can break.
This PR serves as a minimal example:
test_materialize.py::test_imperative_minimal_example