This text was copied from #116 and improved slightly.
This enhancement describes a change that may or may not be an improvement. We will need to investigate whether it actually helps.
Cozy is very sloppy about when it iterates over a collection and when it constructs a new collection. The code generation step decides when to do which one, and the cost model has to know about which choice will be made in which situation.
For quite some time I have been considering pushing this difficult tradeoff to the synthesis engine. To do this, the synthesis engine will need to know the difference between an iterative operation and an in-memory collection. Here's more or less how it would look:
Introduce a new Stream type
Introduce constant-time "iteration" primitives for existing collections: e.g. listToStream
Introduce linear-time "collection" primitives: e.g. streamToList
Convert the aggregation operators like Sum and Length to stream operations, not collection operations.
Convert Map, Filter, and FlatMap to stream operations, not collection operations.
Introduce new constant-time operations like ListLength that work for lists, not streams. (Related: #116)
Streams may only be iterated over once, so we must:
Disallow EStateVar(e) when e's type contains a Stream (but it is still allowed if e requires stream operations to compute).
Disallow let-binding with expressions whose type contains a Stream (but it is still allowed if e requires stream operations to compute).
There are probably some other details I'm forgetting, but those points capture the gist of it. The effect of this change would be that Cozy's synthesizer reasons about when to use iteration. There would be fewer choices for the code generator to make. Hopefully there would also be better alignment between the cost model and the code generator.
This text was copied from #116 and improved slightly.
This enhancement describes a change that may or may not be an improvement. We will need to investigate whether it actually helps.
Cozy is very sloppy about when it iterates over a collection and when it constructs a new collection. The code generation step decides when to do which one, and the cost model has to know about which choice will be made in which situation.
For quite some time I have been considering pushing this difficult tradeoff to the synthesis engine. To do this, the synthesis engine will need to know the difference between an iterative operation and an in-memory collection. Here's more or less how it would look:
There are probably some other details I'm forgetting, but those points capture the gist of it. The effect of this change would be that Cozy's synthesizer reasons about when to use iteration. There would be fewer choices for the code generator to make. Hopefully there would also be better alignment between the cost model and the code generator.