twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0

Improve the Monoid.sumOption to avoid stack overflow #1855

Closed: johnynek closed this 6 years ago

johnynek commented 6 years ago

Someone hit a stack overflow when building a giant list of TypedPipes. This can be avoided by building a tree of merges rather than one long linear chain, which also lowers the depth of the graph (generally a good thing).
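To make the depth difference concrete, here is a minimal sketch of the idea, not the code in this PR. `Plan`, `Source`, `Merge`, and `BalancedMerge` are hypothetical stand-ins for a TypedPipe-like graph; the real types in scalding differ.

```scala
// Hypothetical stand-ins for a TypedPipe-like plan node (not scalding's types).
sealed trait Plan
final case class Source(id: Int) extends Plan
final case class Merge(left: Plan, right: Plan) extends Plan

object BalancedMerge {
  // Linear fold: every new input nests the previous result one level deeper.
  def linear(ps: Seq[Plan]): Option[Plan] =
    ps.reduceLeftOption((a, b) => Merge(a, b))

  // Pairwise rounds: each round halves the number of nodes, so the depth
  // grows with the logarithm of the input size instead of linearly.
  def balanced(ps: Seq[Plan]): Option[Plan] = {
    @annotation.tailrec
    def loop(level: Vector[Plan]): Option[Plan] =
      if (level.isEmpty) None
      else if (level.size == 1) Some(level.head)
      else loop(level.grouped(2).map {
        case Vector(a, b) => Merge(a, b)
        case single       => single.head // odd element carried to the next round
      }.toVector)
    loop(ps.toVector)
  }

  // Depth measured with an explicit stack, so measuring a deep linear plan
  // does not itself overflow. A lone Source has depth 1.
  def depth(p: Plan): Int = {
    var maxDepth = 0
    var stack: List[(Plan, Int)] = List((p, 1))
    while (stack.nonEmpty) {
      val (node, d) = stack.head
      stack = stack.tail
      if (d > maxDepth) maxDepth = d
      node match {
        case Merge(l, r) => stack = (l, d + 1) :: (r, d + 1) :: stack
        case Source(_)   => ()
      }
    }
    maxDepth
  }
}
```

With 8 sources the linear plan has depth 8 and the balanced one depth 4; any recursive pass over the graph needs correspondingly fewer stack frames.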

This test was red when I started.

We could try to find other mitigations as well, but fundamentally we do stack-unsafe recursion when optimizing. We could make it safe with something like cats Eval, but trampolining like that generally carries a big performance penalty.
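For reference, a rough sketch of what trampolined recursion looks like. It uses the standard library's scala.util.control.TailCalls rather than cats Eval, to stay dependency-free; the two are similar in spirit. `Node`, `Leaf`, `Branch`, and `SafeTraversal` are hypothetical and unrelated to the optimizer's real types.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

// Hypothetical graph type, for illustration only.
sealed trait Node
final case class Leaf(id: Int) extends Node
final case class Branch(left: Node, right: Node) extends Node

object SafeTraversal {
  // Plain recursion: one JVM stack frame per level of the graph, so a
  // sufficiently deep graph can throw StackOverflowError.
  def countUnsafe(n: Node): Long = n match {
    case Leaf(_)      => 1L
    case Branch(l, r) => countUnsafe(l) + countUnsafe(r)
  }

  // Trampolined version: each recursive step is suspended as a heap-allocated
  // thunk and run in a loop by `.result`, so graph depth no longer consumes
  // stack. The extra allocation per step is the cost mentioned above.
  def countSafe(n: Node): TailRec[Long] = n match {
    case Leaf(_) => done(1L)
    case Branch(l, r) =>
      for {
        a <- tailcall(countSafe(l))
        b <- tailcall(countSafe(r))
      } yield a + b
  }
}
```

Calling `SafeTraversal.countSafe(graph).result` runs the trampoline. Every step allocates, which is why a shallower graph is the cheaper mitigation when it is enough.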

johnynek commented 6 years ago

@ianoc take a look?

ianoc commented 6 years ago

lgtm