twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.5k stars, 706 forks

More direct map and filter in typed API #1772

Closed johnynek closed 6 years ago

johnynek commented 6 years ago

closes #1742

We actually have map and filter implementations in RichPipe, the untyped Cascading DSL, but somehow we never used them in the typed API.

This is a shame for two reasons: for filter, we were hiding the filters from Cascading, and for map, we were boxing every result into a single-element iterator, which we don't need to do. We can also skip the iteration in the FlatMapFunction, since in a map you know there is exactly one result.
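To make the boxing point concrete, here is a minimal sketch in plain Scala. It does not use Scalding or Cascading classes; `DirectMapSketch` and its method names are hypothetical stand-ins, and plain `Iterator`s play the role of the tuple stream. It only contrasts the two shapes described above.

```scala
// Hypothetical illustration only: these are NOT Scalding's or Cascading's
// real classes. Plain Scala iterators stand in for the tuple stream.
object DirectMapSketch {
  // "Everything is a flatMap" shape: map has to wrap its one result in an
  // iterator, which is then iterated even though it holds exactly one item.
  def mapViaFlatMap[A, B](in: Iterator[A])(fn: A => B): Iterator[B] =
    in.flatMap(a => Iterator.single(fn(a)))

  // Filter expressed as a flatMap: the caller only sees a function that
  // returns zero or one items, not a predicate.
  def filterViaFlatMap[A](in: Iterator[A])(p: A => Boolean): Iterator[A] =
    in.flatMap(a => if (p(a)) Iterator.single(a) else Iterator.empty)

  // Direct shape: no per-element wrapper and no inner iteration.
  def mapDirect[A, B](in: Iterator[A])(fn: A => B): Iterator[B] =
    in.map(fn)

  // Direct filter: the predicate is visible as a predicate.
  def filterDirect[A](in: Iterator[A])(p: A => Boolean): Iterator[A] =
    in.filter(p)

  def main(args: Array[String]): Unit = {
    val viaFlatMap = mapViaFlatMap(Iterator(1, 2, 3))(_ * 2).toList
    val direct = mapDirect(Iterator(1, 2, 3))(_ * 2).toList
    // Both shapes produce the same values; only the per-element work differs.
    println(viaFlatMap == direct)
  }
}
```

Both shapes produce the same output, so the change is purely about per-element overhead and about what the underlying planner can see.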

Finally, I think Cascading reuses the tuple with a filter, so there is less allocation. This may also be possible with map, but I'm not 100% sure about that.