twitter / summingbird

Streaming MapReduce with Scalding and Storm
https://twitter.com/summingbird
Apache License 2.0

Support grouped leftJoin in summingbird #664

Open pankajroark opened 8 years ago

pankajroark commented 8 years ago

The use case is a leftJoin against a service, with in-memory caching on the Storm/Heron nodes. With a groupBy, all data for a given id ends up on a single node, so in-memory caching there can reduce the qps we send to the service we join with.

pankajroark commented 8 years ago

Capturing the related gitter conversation here: this request boils down to key grouping on leftJoin. We could do this by providing an option on leftJoin indicating that key grouping should be used. Another option is to create a GroupedService that encapsulates this notion.

pankajroark commented 8 years ago

A problem with a keyed leftJoin on the online platform is hot keys. Map-side aggregation mitigates the hot-key issue for the sumByKey operation, but it doesn't apply here. For a keyed leftJoin it would be good to also provide a parallelism setting for the keying, i.e. a key could be allowed to go to a fixed set of more than one node, as specified by the parallelism setting. A perfect solution would vary this parallelism dynamically, but that would be very complex to implement; currently nothing in the online job is dynamic. For now a static setting should suffice.
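A sketch of what such a static parallelism setting could look like (illustrative names only, not the summingbird API): each key gets pinned to a small fixed set of downstream tasks, and each event for that key picks one of them at random.

```scala
import scala.util.Random
import scala.util.hashing.MurmurHash3

// Route an event to one of `parallelism` fixed tasks for its key.
// With parallelism = 1 this degrades to plain key grouping.
def taskFor[K](key: K, parallelism: Int, numTasks: Int): Int = {
  // deterministic base task for this key
  val base = math.abs(MurmurHash3.stringHash(key.toString)) % numTasks
  // spread events for the key over `parallelism` consecutive tasks
  (base + Random.nextInt(parallelism)) % numTasks
}
```

With parallelism greater than 1, each node in a key's small task set still sees a large share of that key's events, so per-node caching keeps most of its benefit while the per-node load is divided.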

pankajroark commented 7 years ago

The main use case for this is to enable efficient in-memory caching in the online job. This is a property of the job, not of the service, so a GroupedService doesn't seem intuitive; besides, what counterpart would GroupedService have on the scalding platform? I'm leaning towards enabling the grouping via an option.

pankajroark commented 7 years ago

Hot keys would be an issue in some cases and not in others, so we should give users a choice of whether to enable hot-key handling.

So the option might look like: EnableGroupedJoin(shuffleHotKeys: Boolean)

For handling hot keys I'm thinking about detecting them and handling them separately. We can use a CMS (count-min sketch) to detect the hot keys, then use a CustomStreamGrouping to fan out the hot keys to all downstream nodes, like Storm's shuffle grouping, while using grouped (hash-by-key) routing for the rest of the keys. This way we would still benefit from caching and avoid hot-key issues.
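A rough sketch of that routing decision (hypothetical names; a real implementation would use something like Algebird's CountMinSketch for detection and Storm's CustomStreamGrouping for routing, whereas this dependency-free version keeps exact counts):

```scala
import scala.collection.mutable
import scala.util.Random
import scala.util.hashing.MurmurHash3

// Exact per-key counts stand in for a moving CMS to keep the sketch
// self-contained.
class HotKeyRouter[K](numTasks: Int, hotThreshold: Long) {
  private val counts = mutable.Map.empty[K, Long].withDefaultValue(0L)

  def route(key: K): Int = {
    counts(key) += 1
    if (counts(key) > hotThreshold)
      Random.nextInt(numTasks)  // hot key: fan out like a shuffle grouping
    else
      // cold key: normal hash-by-key grouping, so caching still works
      math.abs(MurmurHash3.stringHash(key.toString)) % numTasks
  }
}
```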

Users who need strict key locality might not be able to use hot-key shuffling. I think it is ok to say we don't support that specific use case right now; we haven't seen anyone ask for it, while we have many users who want a grouped leftJoin. We are hoping to start working on this very soon.

johnynek commented 7 years ago

I think this is something we want in general, and I think the option is a good one.

I think it is really interesting to not shuffle the hot keys, but I'm not sure the win is worth treating heavy and light keys differently. If the topology has to be prepared for heavy keys to be sent over, we might as well use that path for the light keys as well.

Then you can let the ReadableStore implementation handle the caching rules.
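For illustration, a read-through caching wrapper along those lines might look like the following. The real storehaus ReadableStore is asynchronous (its get returns a com.twitter.util.Future); the simplified synchronous trait here is only to keep the sketch self-contained:

```scala
import scala.collection.mutable

// Simplified synchronous stand-in for storehaus's async ReadableStore.
trait SimpleStore[K, V] { def get(k: K): Option[V] }

// Read-through cache in front of the underlying store, with crude
// oldest-entry eviction once maxSize is reached.
class CachingStore[K, V](underlying: SimpleStore[K, V], maxSize: Int)
    extends SimpleStore[K, V] {
  private val cache = mutable.LinkedHashMap.empty[K, Option[V]]

  def get(k: K): Option[V] =
    cache.getOrElseUpdate(k, {
      if (cache.size >= maxSize) cache -= cache.head._1  // evict oldest entry
      underlying.get(k)
    })
}
```

Because the grouped join sends all events for a key to the same node, a small per-node cache like this can absorb most reads for that node's keys.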

pankajroark commented 7 years ago

I'm not sure I understood correctly. Shuffling (distributing equally to all downstream nodes) both hot keys and light keys would mean not doing any grouping at all.

johnynek commented 7 years ago

sorry, I don't mean random shuffle, I mean map-reduce style hash-by-key shuffling

johnynek commented 7 years ago

so, my point is: just do the normal group-by-key. Hot keys will get a lot of items coming to them, but those nodes can batch them up and use caching on read.

The load has to go somewhere. If you fan out hot keys to balance load, like we do in the sketch join in scalding, then the store is going to get hit a bunch. Maybe that's okay if you tune it right and share the load across the storm/heron nodes and the store. But we should probably write a doc on this rather than going straight to code.

Even a markdown doc in the repo where we can have a PR and comment might be good.

pankajroark commented 7 years ago

The problem with hot keys is that the number of events can be so high that just handling the network bytes and deserialization can choke the instance. We typically see a very low number of hot keys, just one or two; it's usually a very small fraction of keys, not enough to increase the load on the store by much if we random-shuffle them.

Yes, I'm just starting to work on a design doc for this. I will share soon.

johnynek commented 7 years ago

so, you want to randomly shard hot keys, but key-shard cold/light keys. Is this right?

Because just the network traffic of sending the hot keys to one node kills you. No problem. As you say, you use a moving CMS to detect heavy hitters, and then if a key is in the heavy hitters you add a random number to the key, else you add 0 to the key, or something like that.

pankajroark commented 7 years ago

Interesting. So we wrap the key while sending and unwrap it on the other side. That would avoid the complexity of adding a custom stream grouping, at a small cost for wrapping/unwrapping; the overhead probably isn't much though. I'll capture that in the doc.
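A minimal sketch of that wrap/unwrap idea (all names here are hypothetical, and the heavy-hitter check is stubbed out with a predicate rather than a real CMS):

```scala
import scala.util.Random

// Salt heavy-hitter keys with a random shard id so they spread across
// downstream tasks; salt everything else with 0 so cold keys stay grouped.
def saltKey[K](key: K, isHot: K => Boolean, fanout: Int): (K, Int) =
  if (isHot(key)) (key, Random.nextInt(fanout)) else (key, 0)

// Downstream, drop the salt to recover the original key before the join.
def unsaltKey[K](salted: (K, Int)): K = salted._1
```

Since the salted key is just a larger tuple flowing through the ordinary hash-by-key grouping, no custom stream grouping is needed; the cost is the extra tuple allocation and the slightly larger serialized key.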


pankajroark commented 7 years ago

@johnynek Shared the design doc with you.