twitter / summingbird

Streaming MapReduce with Scalding and Storm
https://twitter.com/summingbird
Apache License 2.0
2.14k stars 267 forks source link

Externalize Monoid in flat map and summer bolts #766

Closed ttim closed 6 years ago

ttim commented 6 years ago

Currently Monoid instance passed to sumByKey is serialized through java serialization (for summingbird-storm platform), while other user's code (e.g. flatMap functions) is serialized through Externalizer (which tries java serialization and kryo in case it didn't work). Seems like this behaviour should be consistent and we should serialize Monoid through Externalizer as well.

In this PR I've added test for this and fixed it via putting Monoid into Externalizer box.

CLAassistant commented 6 years ago

CLA assistant check
All committers have signed the CLA.

ttim commented 6 years ago

It would be good to have a test for each part of the change such that it fails without it, but if we don’t have time I’m not opposed to merging.

When I was testing firstly I ran test without changes (failed with Tail bolt), then with only summer change (failed with FlatMap-Tail bolt) and then with both changes. Since this test covers both partial aggregation and full aggregation I think it's enough.