twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.48k stars 704 forks

[prototype] Introduce `KeyGrouping` to make it possible to enable `OrderedSerialization` without code changes #1857

Open ttim opened 6 years ago

ttim commented 6 years ago

We're considering enabling `OrderedSerialization` for most users at Twitter. Currently we have a blocker for that: users need to change source code to enable it, rather than just pass a parameter. A parameter would be much more useful for us, because we have auto-tuning infrastructure that makes it possible to land such changes incrementally.

I've tried to prototype a workaround. The basic idea is to introduce a `KeyGrouping` class which holds both the implicitly provided `Ordering` and a generated `OrderedSerialization`. With this approach we can choose which one of them to use at runtime.

Another nice thing about this approach is that it's source backward compatible.

This is just a prototype (it doesn't even hold both `OrderedSerialization` and `Ordering` yet; I used it to gather stats on `Ordering` usage across our repo), but it illustrates the idea.
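To make the idea concrete, here is a minimal, dependency-free sketch of what such a holder could look like. It is not the code in this PR. `OrderedSer` is a hypothetical stand-in for scalding's `OrderedSerialization`, `select` and `sortKeys` are made-up names, and the instance that would be macro-generated in practice is assumed to be supplied by hand.

```scala
object KeyGroupingSketch {
  // Hypothetical stand-in for OrderedSerialization: it only wraps a
  // comparison so that the example has no scalding dependency.
  final case class OrderedSer[T](ordering: Ordering[T])

  // Holds the user's Ordering plus, when one is available, the
  // serialization-aware alternative. The choice is deferred to runtime.
  final case class KeyGrouping[T](ord: Ordering[T], ordSer: Option[OrderedSer[T]]) {
    // useOrderedSer would come from a job parameter, not from user code.
    def select(useOrderedSer: Boolean): Ordering[T] =
      if (useOrderedSer) ordSer.map(_.ordering).getOrElse(ord) else ord
  }

  object KeyGrouping extends LowPriorityKeyGrouping {
    // Preferred when both an Ordering and an OrderedSer are in scope.
    implicit def withSer[T](implicit ord: Ordering[T], ser: OrderedSer[T]): KeyGrouping[T] =
      KeyGrouping(ord, Some(ser))
  }

  trait LowPriorityKeyGrouping {
    // Fallback: call sites that only provide an Ordering keep compiling.
    implicit def orderingOnly[T](implicit ord: Ordering[T]): KeyGrouping[T] =
      KeyGrouping(ord, None)
  }

  // An API that used to take an implicit Ordering[K] takes KeyGrouping[K] instead.
  def sortKeys[K](keys: List[K], useOrderedSer: Boolean)(implicit kg: KeyGrouping[K]): List[K] =
    keys.sorted(kg.select(useOrderedSer))
}
```

A caller that today writes `sortKeys(keys, flag)` with only an `Ordering` in scope would not have to change anything, which is the source-compatibility property mentioned above.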

@johnynek what do you think?

CLAassistant commented 6 years ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: ttim
:x: Timur Abishev


Timur Abishev seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

johnynek commented 5 years ago

Thinking back on this...

We could take an alternative approach: require `OrderedSerialization` everywhere we currently use `Ordering`, but have a low-priority implicit that uses Kryo to supply the `OrderedSerialization` if you only have an `Ordering`.
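A rough sketch of that alternative, under loud assumptions: `OrderedSer` is a simplified, hypothetical stand-in for `OrderedSerialization`, Java serialization stands in for Kryo so that the example has no external dependencies, and the hand-written `Int` instance stands in for a generated one. None of these names come from scalding.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

object LowPrioritySketch {
  // Simplified, hypothetical stand-in for OrderedSerialization.
  trait OrderedSer[T] {
    def ordering: Ordering[T]
    def toBytes(t: T): Array[Byte]
    def fromBytes(bytes: Array[Byte]): T
    def isFallback: Boolean
  }

  trait LowPriorityOrderedSer {
    // Fallback used only when no dedicated instance exists: reuse the
    // Ordering and serialize generically (Java serialization here, Kryo
    // in the actual proposal).
    implicit def fromOrdering[T](implicit ord: Ordering[T]): OrderedSer[T] =
      new OrderedSer[T] {
        val ordering: Ordering[T] = ord
        def toBytes(t: T): Array[Byte] = {
          val bos = new ByteArrayOutputStream()
          val oos = new ObjectOutputStream(bos)
          oos.writeObject(t.asInstanceOf[AnyRef])
          oos.close()
          bos.toByteArray
        }
        def fromBytes(bytes: Array[Byte]): T = {
          val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
          try ois.readObject().asInstanceOf[T] finally ois.close()
        }
        val isFallback: Boolean = true
      }
  }

  object OrderedSer extends LowPriorityOrderedSer {
    // Dedicated instance; wins over the fallback because it is defined
    // in the object that extends the low-priority trait.
    implicit val intSer: OrderedSer[Int] = new OrderedSer[Int] {
      val ordering: Ordering[Int] = Ordering.Int
      def toBytes(t: Int): Array[Byte] = ByteBuffer.allocate(4).putInt(t).array()
      def fromBytes(bytes: Array[Byte]): Int = ByteBuffer.wrap(bytes).getInt
      val isFallback: Boolean = false
    }
  }

  // APIs would require OrderedSer everywhere instead of Ordering.
  def group[K](keys: List[K])(implicit ser: OrderedSer[K]): List[K] =
    keys.sorted(ser.ordering)
}
```

With this layout, `group(List("b", "a"))` resolves through the fallback because only an `Ordering[String]` exists, while `group(List(2, 1))` picks the dedicated `Int` instance.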