TimelyDataflow / timely-dataflow

A modular implementation of timely dataflow in Rust
MIT License

Introduce non-clone streams #508

Open antiguru opened 1 year ago

antiguru commented 1 year ago

At the moment, any data sent through a Timely stream needs to be Clone because Tee must be able to clone data to send it to multiple recipients. This PR tries to lift that requirement, although it comes with far-reaching, breaking changes.
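To illustrate where the Clone bound comes from, here is a minimal sketch. The `Push`, `Collect`, and `Tee` types and the `fan_out` helper are simplified, hypothetical stand-ins written for this example; they are not Timely's actual definitions.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a pusher: anything that takes ownership of a value.
trait Push<T> {
    fn push(&mut self, item: T);
}

/// A recipient that records what it receives in a shared buffer.
struct Collect<T>(Rc<RefCell<Vec<T>>>);

impl<T> Push<T> for Collect<T> {
    fn push(&mut self, item: T) {
        self.0.borrow_mut().push(item);
    }
}

/// Fan-out to several recipients behind trait objects.
struct Tee<T> {
    targets: Vec<Box<dyn Push<T>>>,
}

// Every recipient takes ownership, so all but the last need their own copy.
// This is the only place the `T: Clone` bound is required.
impl<T: Clone> Push<T> for Tee<T> {
    fn push(&mut self, item: T) {
        if let Some((last, rest)) = self.targets.split_last_mut() {
            for target in rest {
                target.push(item.clone());
            }
            // The final recipient can take the original without cloning.
            last.push(item);
        }
    }
}

/// Sends one value to `n` recipients and returns what each one received.
fn fan_out(n: usize, value: String) -> Vec<Vec<String>> {
    let sinks: Vec<Rc<RefCell<Vec<String>>>> =
        (0..n).map(|_| Rc::new(RefCell::new(Vec::new()))).collect();
    let mut tee = Tee {
        targets: sinks
            .iter()
            .map(|s| Box::new(Collect(Rc::clone(s))) as Box<dyn Push<String>>)
            .collect(),
    };
    tee.push(value);
    sinks.iter().map(|s| s.borrow().clone()).collect()
}
```

With a single recipient no clone happens at all, which is why a stream with exactly one consumer does not inherently need `T: Clone`.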

I don't intend to merge this yet, but it seems like a solid basis for further experiments.

Specifically, it introduces new types and changes behavior:

The effect of this change is that for op --> op structures, we only have a vcall (virtual call) where we currently have a vector dereference plus a vcall. This can be better in some situations, but I haven't measured it. The downside is that requesting a tee adds the vector dereference plus vcall after the first vcall, so that path is strictly worse than it is today. This should be amortized by how infrequently a tee is used, but who knows.
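A rough sketch of the two dispatch paths, under the assumption that an operator output holds exactly one boxed pusher. All names (`Push`, `Sink`, `Tee`, `Output`, `direct`, `with_tee`) are hypothetical, and `push` returns a virtual-call count purely so the difference is visible; real pushers would not do this.

```rust
/// Hypothetical pusher trait. For illustration only, `push` returns the
/// number of virtual calls on the longest path the item took.
trait Push<T> {
    fn push(&mut self, item: T) -> usize;
}

/// A downstream operator's input; it discards the item.
struct Sink;

impl Push<u64> for Sink {
    fn push(&mut self, _item: u64) -> usize {
        1 // reached through one `dyn Push` call
    }
}

/// Fan-out that is itself reached through a `dyn Push` call.
struct Tee {
    targets: Vec<Box<dyn Push<u64>>>,
}

impl Push<u64> for Tee {
    fn push(&mut self, item: u64) -> usize {
        // Vector dereference plus one more virtual call per target.
        let deepest = self
            .targets
            .iter_mut()
            .map(|t| t.push(item))
            .max()
            .unwrap_or(0);
        1 + deepest
    }
}

/// An operator output holding exactly one boxed pusher.
struct Output {
    pusher: Box<dyn Push<u64>>,
}

/// op --> op: the output points straight at the downstream input.
fn direct() -> Output {
    Output { pusher: Box::new(Sink) }
}

/// A tee was requested: the output points at a `Tee`, which fans out.
fn with_tee(recipients: usize) -> Output {
    let targets = (0..recipients)
        .map(|_| Box::new(Sink) as Box<dyn Push<u64>>)
        .collect();
    Output { pusher: Box::new(Tee { targets }) }
}
```

In this sketch the direct path costs one virtual call per item, while the tee path costs one call to reach the tee plus a vector access and another call per recipient.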

In theory, we don't need the vcall at all because the type of the downstream operator, and hence of its pusher, is known to Rust. However, I currently can't see how to wire it up, and I believe it would be ugly because the information needs to flow backwards from the receiving operator to the source.
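For comparison, a minimal sketch of what a fully static version could look like outside of Timely. The `Push`, `Sum`, `Double`, and `run` names are hypothetical. The upstream type is generic over the downstream pusher type, so the call is monomorphized and no trait object is involved.

```rust
/// Hypothetical pusher trait, used here with static dispatch only.
trait Push<T> {
    fn push(&mut self, item: T);
}

/// Downstream operator: accumulates a running total.
struct Sum {
    total: u64,
}

impl Push<u64> for Sum {
    fn push(&mut self, item: u64) {
        self.total += item;
    }
}

/// Upstream operator, generic over the concrete type `P` of its consumer.
struct Double<P> {
    next: P,
}

impl<P: Push<u64>> Push<u64> for Double<P> {
    fn push(&mut self, item: u64) {
        // Resolved at compile time: no `dyn`, no virtual call.
        self.next.push(item * 2);
    }
}

/// Builds the chain and feeds it the inputs.
fn run(inputs: &[u64]) -> u64 {
    // The sink has to exist before the source can be constructed, because
    // the source's type mentions the sink's type.
    let mut chain = Double { next: Sum { total: 0 } };
    for &x in inputs {
        chain.push(x);
    }
    chain.next.total
}
```

The sketch also shows the awkward part: the type of `Double` depends on the type of its consumer, so the chain has to be assembled from the receiving end backwards, whereas a dataflow is normally built from the source forwards.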