twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.5k stars, 706 forks

TypedPipe.groupRandomly() use Ordering with EquivSerialization #1934

Closed (tlazaro closed 4 years ago)

tlazaro commented 4 years ago

We saw the error "Scalding's ordered serialization logic exhausted the finite supply of boxed classes." when creating many parallel Executions from the same job. We traced it to groupRandomly() not using an EquivSerialization in its groupBy(), so every Execution took up new boxed-class slots.

This PR converts identityOrdering to a case object that extends EquivSerialization, following the pattern of com.twitter.scalding.serialization.UnitOrderedSerialization.
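
To illustrate why the marker trait matters, here is a minimal, self-contained sketch that does not use Scalding's real classes. `HasStableEquality`, `SlotRegistry`, `PlainOrdering`, `IdentityOrdering` and `Demo` are hypothetical stand-ins, and the sketch assumes the behavior described above: a slot is reused only when the ordering is marked as having meaningful equality, and a new slot is taken otherwise.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the marker role played by EquivSerialization.
trait HasStableEquality

// Toy registry with a finite number of slots (stand-in for boxed classes).
final class SlotRegistry(capacity: Int) {
  private val cache = mutable.Map.empty[Any, Int]
  private var next = 0

  // Hands out the next free slot, or fails once the supply is exhausted.
  private def allocate(): Int = {
    if (next >= capacity)
      throw new IllegalStateException("exhausted the finite supply of slots")
    val slot = next
    next += 1
    slot
  }

  // Assumption: only marked orderings are cached; anything else gets a new slot.
  def slotFor(ord: Ordering[Int]): Int = ord match {
    case key: HasStableEquality => cache.getOrElseUpdate(key, allocate())
    case _                      => allocate()
  }

  def used: Int = next
}

// Before: an ordering without the marker takes a new slot on every use.
object PlainOrdering extends Ordering[Int] {
  def compare(a: Int, b: Int): Int = java.lang.Integer.compare(a, b)
}

// After: a case object carrying the marker is cached and reuses one slot.
case object IdentityOrdering extends Ordering[Int] with HasStableEquality {
  def compare(a: Int, b: Int): Int = java.lang.Integer.compare(a, b)
}

object Demo {
  // Simulates `executions` parallel uses of the same ordering.
  def slotsUsed(ord: Ordering[Int], executions: Int): Int = {
    val registry = new SlotRegistry(capacity = 100)
    (1 to executions).foreach(_ => registry.slotFor(ord))
    registry.used
  }
}
```

In this sketch, `Demo.slotsUsed(PlainOrdering, 10)` consumes ten slots while `Demo.slotsUsed(IdentityOrdering, 10)` consumes one, which mirrors the slot exhaustion reported above under the stated assumption.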

CLAassistant commented 4 years ago

All committers have signed the CLA.