TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.
MIT License

Merge batcher input generic over containers #494

Closed antiguru closed 1 month ago

antiguru commented 1 month ago

Add infrastructure and implementations that let merge batchers accept containers, both specific and general, from timely streams. This is the last piece needed to build arrangements from non-vector input data.

The change adds several new traits and implementations, which fall roughly into two categories:

  1. Capture behavior to transcribe streams of containers to chunks of sorted and consolidated data, suitable for integration in merge batchers.
  2. Split of merge batchers into chunk formation and chain maintenance.

The first allows spines to express more opinions on how streams of containers can be transformed into chunks, either by supplying specialized implementations that know exactly how to map specific containers, or by instructing a generic implementation with specific information about containers.
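To make the first category concrete, here is a minimal sketch of a chunker that accepts two different input containers and emits sorted, consolidated chunks. It is not the code from this PR: `Absorb`, `Columns`, `VecChunker`, `consolidate`, and the `chunk_size` field are hypothetical names chosen for illustration, and diffs are fixed to `i64` to keep it short.

```rust
/// Hypothetical trait: a chunker implements this once per input container it understands.
trait Absorb<Input> {
    fn absorb(&mut self, input: Input);
}

/// Hypothetical non-vector input container: three parallel columns.
struct Columns<D, T> {
    data: Vec<D>,
    times: Vec<T>,
    diffs: Vec<i64>,
}

/// Hypothetical chunker that buffers updates and emits chunks of at most `chunk_size` updates.
struct VecChunker<D, T> {
    pending: Vec<(D, T, i64)>,
    chunk_size: usize,
}

/// Sort updates by `(data, time)`, sum the diffs of equal keys, and drop zero diffs.
fn consolidate<D: Ord, T: Ord>(updates: &mut Vec<(D, T, i64)>) {
    updates.sort_by(|a, b| (&a.0, &a.1).cmp(&(&b.0, &b.1)));
    let mut out: Vec<(D, T, i64)> = Vec::with_capacity(updates.len());
    for (d, t, r) in updates.drain(..) {
        match out.last_mut() {
            Some(last) if last.0 == d && last.1 == t => last.2 += r,
            _ => out.push((d, t, r)),
        }
    }
    out.retain(|u| u.2 != 0);
    *updates = out;
}

/// Specialized mapping for plain vectors of updates.
impl<D, T> Absorb<Vec<(D, T, i64)>> for VecChunker<D, T> {
    fn absorb(&mut self, mut input: Vec<(D, T, i64)>) {
        self.pending.append(&mut input);
    }
}

/// Specialized mapping for the columnar container.
impl<D, T> Absorb<Columns<D, T>> for VecChunker<D, T> {
    fn absorb(&mut self, input: Columns<D, T>) {
        let rows = input.data.into_iter().zip(input.times).zip(input.diffs);
        self.pending.extend(rows.map(|((d, t), r)| (d, t, r)));
    }
}

impl<D: Ord, T: Ord> VecChunker<D, T> {
    /// Consolidate everything buffered so far and cut it into chunks.
    fn finish(&mut self) -> Vec<Vec<(D, T, i64)>> {
        consolidate(&mut self.pending);
        let mut chunks = Vec::new();
        let mut current = Vec::new();
        for update in self.pending.drain(..) {
            current.push(update);
            if current.len() == self.chunk_size {
                chunks.push(std::mem::take(&mut current));
            }
        }
        if !current.is_empty() {
            chunks.push(current);
        }
        chunks
    }
}
```

In this sketch the chunker decides how each container maps to updates, so supporting a new container only requires another `Absorb` impl. The downstream chunk format stays unchanged.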

The second decouples the merge batcher's chain maintenance from chunk formation, which seems like the right thing to do given that the two are distinct tasks.
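The chain-maintenance half can be sketched on its own, assuming it only ever receives chunks that are already sorted and consolidated. The names `ChainList`, `insert`, `seal`, and `merge` are hypothetical, each chain is simplified to a single sorted vector, and the "merge while the newest chain is at least half the size of its predecessor" rule is an assumption for illustration rather than a description of the code in this PR.

```rust
use std::cmp::Ordering;

/// Merge two sorted, consolidated runs, summing diffs of equal `(data, time)` keys.
fn merge<D: Ord, T: Ord>(a: Vec<(D, T, i64)>, b: Vec<(D, T, i64)>) -> Vec<(D, T, i64)> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let mut a = a.into_iter().peekable();
    let mut b = b.into_iter().peekable();
    loop {
        let order = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => (&x.0, &x.1).cmp(&(&y.0, &y.1)),
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (None, None) => break,
        };
        match order {
            Ordering::Less => out.extend(a.next()),
            Ordering::Greater => out.extend(b.next()),
            Ordering::Equal => {
                let (d, t, r1) = a.next().unwrap();
                let (_, _, r2) = b.next().unwrap();
                // Updates that cancel out are dropped.
                if r1 + r2 != 0 {
                    out.push((d, t, r1 + r2));
                }
            }
        }
    }
    out
}

/// Hypothetical chain maintainer: knows nothing about how chunks were formed.
struct ChainList<D, T> {
    chains: Vec<Vec<(D, T, i64)>>,
}

impl<D: Ord, T: Ord> ChainList<D, T> {
    /// Add a sorted chunk, merging while the newest chain is at least
    /// half the size of the one before it (assumed policy).
    fn insert(&mut self, chunk: Vec<(D, T, i64)>) {
        if chunk.is_empty() {
            return;
        }
        self.chains.push(chunk);
        while self.chains.len() > 1 {
            let n = self.chains.len();
            if self.chains[n - 1].len() < self.chains[n - 2].len() / 2 {
                break;
            }
            let newer = self.chains.pop().unwrap();
            let older = self.chains.pop().unwrap();
            let merged = merge(older, newer);
            if !merged.is_empty() {
                self.chains.push(merged);
            }
        }
    }

    /// Merge all remaining chains into one sorted, consolidated run.
    fn seal(&mut self) -> Vec<(D, T, i64)> {
        let mut result = Vec::new();
        while let Some(chain) = self.chains.pop() {
            result = merge(chain, result);
        }
        result
    }
}
```

Because `ChainList` only sees sorted runs, any chunk-forming front end can feed it regardless of the container the data originally arrived in.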