hydro-project / hydroflow

Hydro's low-level dataflow runtime
https://hydro.run/docs/hydroflow/
Apache License 2.0

Deprecate/refactor old stateful/replaying operators #1058

Closed: MingweiSamuel closed this issue 1 month ago

MingweiSamuel commented 7 months ago

jhellerstein commented 4 months ago

Replay on joins/bimorphisms now works correctly because we insist on persisting the inputs (which a bimorphism requires semantically).
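To illustrate why persisting both inputs makes replay well defined, here is a minimal plain-Rust sketch. It does not use Hydroflow's API or reflect its implementation; `PersistedJoin` and `tick` are hypothetical names chosen for this example. Each call to `tick` appends the new inputs to persisted state and then replays the join over everything seen so far.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a join whose inputs are persisted across ticks.
/// Illustrative only; this is not Hydroflow's implementation.
#[derive(Default)]
struct PersistedJoin {
    left: Vec<(u32, &'static str)>,
    right: Vec<(u32, &'static str)>,
}

impl PersistedJoin {
    /// Append this tick's new inputs to the persisted state, then replay
    /// the join over all inputs seen so far.
    fn tick(
        &mut self,
        new_left: &[(u32, &'static str)],
        new_right: &[(u32, &'static str)],
    ) -> Vec<(u32, &'static str, &'static str)> {
        self.left.extend_from_slice(new_left);
        self.right.extend_from_slice(new_right);

        // Index the persisted right side by key.
        let mut index: HashMap<u32, Vec<&'static str>> = HashMap::new();
        for &(k, v) in &self.right {
            index.entry(k).or_default().push(v);
        }

        // Emit one output per matching (left, right) pair, in left order.
        let mut out = Vec::new();
        for &(k, l) in &self.left {
            if let Some(rs) = index.get(&k) {
                for &r in rs {
                    out.push((k, l, r));
                }
            }
        }
        out
    }
}
```

Because both sides are retained, a left item that arrived in an earlier tick still joins with a right item that arrives later, and a tick with no new input reproduces the previous output. If either side were dropped at the end of a tick, the replayed output would depend on arrival order.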

Remaining work plan is around references:

  1. Introduce an n-way cross operator to gather refs. Give it a meaningful name (not cross-join; maybe "tuple" or "gatherRefs" or something)
  2. Verify (or rewrite the code so) that the output of the n-way cross is reflected in the argument of its consumer.
  3. Check that replay of refs through the cross operator works as expected.
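The shape of the plan above can be sketched in plain Rust, fixed at three inputs for brevity. This is not an existing Hydroflow operator: `gather_refs` simply borrows one of the candidate names floated above, and `describe` is a hypothetical consumer. The point is that the consumer's argument type mirrors the tuple of references the cross produces, and that running the cross again over the same borrowed inputs yields the same tuples.

```rust
/// Hypothetical 3-way "gather refs" helper: yields every combination of one
/// reference from each input, as a tuple. Not a Hydroflow operator.
fn gather_refs<'a, A, B, C>(
    a: &'a [A],
    b: &'a [B],
    c: &'a [C],
) -> impl Iterator<Item = (&'a A, &'a B, &'a C)> + 'a {
    a.iter().flat_map(move |x| {
        b.iter()
            .flat_map(move |y| c.iter().map(move |z| (x, y, z)))
    })
}

/// Hypothetical consumer whose argument has the same tuple-of-refs shape
/// that `gather_refs` produces (step 2 of the plan).
fn describe((name, count, flag): (&String, &u32, &bool)) -> String {
    format!("{}:{}:{}", name, count, flag)
}

/// Run the cross and the consumer once over borrowed inputs.
fn run_once(names: &[String], counts: &[u32], flags: &[bool]) -> Vec<String> {
    gather_refs(names, counts, flags).map(describe).collect()
}
```

Calling `run_once` twice with the same slices stands in for step 3: since only references are gathered, the inputs are never consumed and the second pass sees exactly what the first did.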

jhellerstein commented 4 months ago

We could do this now, but it would be better to gather more experience, and perhaps wait for more helpers in Hydroflow+ before we remove the more ergonomic existing "convenience" operators.

MingweiSamuel commented 2 months ago

Low priority for the 0.9 release. Hoping to drive this from Hydroflow+ use cases and filter downward what works and what is needed.

And look at Shadaj's Flo paper