twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0

Fix the WritePartitioner to exactly match cascading #1805

Closed by johnynek 6 years ago

johnynek commented 6 years ago

We were previously failing the laws, sometimes taking more than n + 1 steps where cascading took n. With the improved logic, those bugs are fixed and we now match cascading exactly.

This should allow us to use batching to bypass any case where cascading takes too long to plan.

johnynek commented 6 years ago

@non helped me get the algorithms right here. Thanks!

johnynek commented 6 years ago

fixes #1804

ianoc commented 6 years ago

lgtm