Open jackfirth opened 4 years ago
Note: should probably gather together references to prior art here. Stream fusion is a very widely studied problem:
- Racket's `for` forms, of course.
- Java streams define their operations on the `Stream<T>` interface and allow stream implementations to override those methods. The stream returned by `stream.distinct()` could be a `DistinctStream<T>` that overrides the `distinct()` method to be a no-op, so `stream.distinct().distinct()` doesn't try to dedupe the stream twice. (I have no idea if that's what actually happens, but that's the general idea.)
These two transducer pipelines should have the exact same performance:
That is, there should be rules for statically fusing transducers, similar to how `for` forms specially recognize the various `in-list` and `in-vector` forms. Any function that accepts a chain of transducers, such as `transduce` and `transducer-compose` (#191), should be wrapped in a macro that looks for fusion opportunities. There are two primary performance benefits:

- Dead transducer elimination. In the example above, the `(taking 20)` transducer is completely unnecessary because there's a `(taking 5)` transducer downstream, and the only transducers in between have no effect on the number of elements or their order. Transducer fusion should detect and eliminate dead transducers.
- Fewer element exchanges between transducers. Each time a transducer consumes or emits a value, some contracts need to be checked and a variant needs to be constructed to wrap the transducer's next state. Fusion can reduce this checking, especially in degenerate cases.
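Dead transducer elimination can be sketched as a rewrite over a symbolic description of the pipeline. The sketch below is plain Racket and does not use Rebellion's actual API: the quoted `(taking n)` and `(mapping f)` forms, `count-preserving?`, and `fuse-takes` are hypothetical names chosen for illustration, and the rewrite assumes mapping functions are pure. A real implementation would presumably perform a similar rewrite inside a macro at expansion time.

```racket
#lang racket/base
(require racket/list racket/match)

;; Hypothetical classification: a step is count-preserving when it emits
;; exactly one element per input element, in the same order.
(define (count-preserving? step)
  (match step
    [(list 'mapping _) #t]
    [_ #f]))

;; Rewrites a list of symbolic steps so that two `taking` steps separated
;; only by count-preserving steps collapse into one, placed downstream.
(define (fuse-takes steps)
  (match steps
    ['() '()]
    [(cons (list 'taking n) rest)
     ;; Look past the count-preserving steps that follow this `taking`.
     (define-values (between after) (splitf-at rest count-preserving?))
     (match after
       [(cons (list 'taking m) tail)
        ;; Only (min n m) elements can ever get through both limits,
        ;; so the upstream limit is dropped and the pipeline re-scanned.
        (fuse-takes (append between (cons (list 'taking (min n m)) tail)))]
       [_ (cons (list 'taking n) (fuse-takes rest))])]
    [(cons step rest) (cons step (fuse-takes rest))]))

;; The upstream limit disappears when only a mapping sits in between.
(fuse-takes '((taking 20) (mapping add1) (taking 5)))

;; A filtering step can change the element count, so nothing is fused.
(fuse-takes '((taking 20) (filtering even?) (taking 5)))
```

Keeping the merged limit at the downstream position matches the description above, where the upstream `(taking 20)` is the one treated as dead.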
Alternatives considered
There could be some sort of protocol for dynamic transducer fusion, using generic interfaces. The upside is that this kind of code could still trigger fusion:
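As a rough illustration of what such a dynamic protocol might look like, here is a minimal sketch built on `racket/generic`. Everything in it is an assumption made for the example rather than part of Rebellion: the `fusable` interface, its `fuse-upstream` method, the `taking-step` and `mapping-step` structs, and `compose-steps` are all hypothetical. The point is only that fusion is decided at runtime, so it still applies when the steps arrive as ordinary values.

```racket
#lang racket/base
(require racket/generic)

;; Hypothetical protocol: a step may absorb the step directly upstream of it.
(define-generics fusable
  ;; Returns a single replacement for the pair (upstream, fusable),
  ;; or #f when no fusion applies.
  (fuse-upstream fusable upstream))

(struct taking-step (count)
  #:transparent
  #:methods gen:fusable
  [(define (fuse-upstream self upstream)
     ;; Two adjacent limits behave like one limit of the smaller count.
     (and (taking-step? upstream)
          (taking-step (min (taking-step-count self)
                            (taking-step-count upstream)))))])

(struct mapping-step (function)
  #:transparent
  #:methods gen:fusable
  [(define (fuse-upstream self upstream) #f)])

;; Composes steps at runtime, asking each step whether it can fuse with
;; the step accumulated just before it.
(define (compose-steps steps)
  (reverse
   (for/fold ([fused '()])
             ([step (in-list steps)])
     (define replacement
       (and (pair? fused) (fuse-upstream step (car fused))))
     (if replacement
         (cons replacement (cdr fused))
         (cons step fused)))))

;; The steps are plain runtime values here, so a macro could not
;; inspect them, but the protocol still collapses the two limits.
(define first-limit (taking-step 20))
(define second-limit (taking-step 5))
(compose-steps (list first-limit second-limit))
```

This sketch only fuses directly adjacent steps. Looking across intervening steps would need a richer protocol.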
However, the downsides are significant: