confluentinc / kafka-tutorials

Tutorials and Recipes for Apache Kafka
https://developer.confluent.io/tutorials
Apache License 2.0

Join two streams (KStreams) #28

Open tlberglund opened 5 years ago

gAmUssA commented 5 years ago

@MichaelDrogalis @tlberglund any preference on what type of join we should tackle here?

I'm thinking about a use case where we join two transactions arriving on two streams (say, an Electronics purchase and a Clothes purchase within 20 minutes; for simplicity I can simulate a smaller window) and produce a third stream (issuing a coupon for a free cup of coffee). This would be an example of an inner join: the coupon is issued only if both purchases were completed within the time window. How does that sound? cc @riferrei
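
To make the proposed semantics concrete, here is a minimal plain-Java sketch of a windowed inner join. It deliberately does not use the Kafka Streams API; the `Purchase` and `Coupon` types, the `customerId` join key, and the `join` helper are all hypothetical names chosen for illustration. It assumes a symmetric window, meaning two purchases match when their event times differ by at most the window size.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class WindowedInnerJoinSketch {

    // Hypothetical event: a purchase keyed by customer, stamped with event time (epoch millis).
    public static final class Purchase {
        public final String customerId;
        public final String item;
        public final long eventTimeMs;

        public Purchase(String customerId, String item, long eventTimeMs) {
            this.customerId = customerId;
            this.item = item;
            this.eventTimeMs = eventTimeMs;
        }
    }

    // Hypothetical join result: a coupon issued to a customer.
    public static final class Coupon {
        public final String customerId;
        public final String reason;

        public Coupon(String customerId, String reason) {
            this.customerId = customerId;
            this.reason = reason;
        }

        @Override
        public String toString() {
            return "Coupon for " + customerId + " (" + reason + ")";
        }
    }

    // Inner join: emit a coupon only when both sides have a purchase for the
    // same customer and the two event times are at most `window` apart.
    public static List<Coupon> join(List<Purchase> electronics,
                                    List<Purchase> clothes,
                                    Duration window) {
        List<Coupon> coupons = new ArrayList<>();
        long windowMs = window.toMillis();
        for (Purchase e : electronics) {
            for (Purchase c : clothes) {
                boolean sameCustomer = e.customerId.equals(c.customerId);
                boolean withinWindow = Math.abs(e.eventTimeMs - c.eventTimeMs) <= windowMs;
                if (sameCustomer && withinWindow) {
                    coupons.add(new Coupon(e.customerId, e.item + " + " + c.item));
                }
            }
        }
        return coupons;
    }

    public static void main(String[] args) {
        long minute = Duration.ofMinutes(1).toMillis();
        List<Purchase> electronics = List.of(
                new Purchase("alice", "laptop", 0),
                new Purchase("bob", "phone", 0),
                new Purchase("carol", "camera", 0));
        List<Purchase> clothes = List.of(
                new Purchase("alice", "jeans", 10 * minute),  // inside the 20-minute window
                new Purchase("bob", "shirt", 45 * minute));   // outside the window
        // carol has no clothes purchase, so an inner join emits nothing for her.
        for (Coupon coupon : join(electronics, clothes, Duration.ofMinutes(20))) {
            System.out.println(coupon);
        }
    }
}
```

In this sketch only alice gets a coupon: bob's purchases are too far apart and carol has no match on the clothes side. A real streaming join would keep per-key windowed state rather than comparing every pair, but the matching rule is the same.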

MichaelDrogalis commented 5 years ago

That sounds great to me. For controlling time, you can use whatever window size makes sense for the example and use event-time to make it all work.
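
As a rough illustration of the event-time point, the plain-Java sketch below reads the timestamp from the record payload instead of using arrival time, so the window check gives the same answer no matter when the records are processed. The CSV record layout and the `extractEventTimeMs` and `withinWindow` helpers are assumptions made for this example; in Kafka Streams the equivalent role is played by a `TimestampExtractor`.

```java
import java.time.Duration;
import java.time.Instant;

public class EventTimeSketch {

    // Hypothetical record layout: "customerId,item,ISO-8601 event time".
    public static long extractEventTimeMs(String csvRecord) {
        String[] fields = csvRecord.split(",");
        return Instant.parse(fields[2].trim()).toEpochMilli();
    }

    // True when the two embedded event times are at most `window` apart.
    public static boolean withinWindow(String left, String right, Duration window) {
        long diffMs = Math.abs(extractEventTimeMs(left) - extractEventTimeMs(right));
        return diffMs <= window.toMillis();
    }

    public static void main(String[] args) {
        String electronics = "alice,laptop,2019-07-24T10:00:00Z";
        String clothes = "alice,jeans,2019-07-24T10:15:00Z";
        // The purchases are 15 minutes apart in event time.
        System.out.println(withinWindow(electronics, clothes, Duration.ofMinutes(20))); // prints true
        System.out.println(withinWindow(electronics, clothes, Duration.ofMinutes(5)));  // prints false
    }
}
```

Because the timestamps are part of the test data, the tutorial can pick whatever window size reads well and still produce deterministic results.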

riferrei commented 5 years ago

@gAmUssA instead of coming up with a new domain model, why don't we reuse the same one as the other recipes? For example, it would make more sense (to me at least) to make the Join 2 Streams KSQL recipe similar to the Join 2 Tables KSQL recipe. That provides consistency throughout the recipes and lets users evaluating them understand the main differences more easily -- since the domain model is the same.

Thoughts?

MichaelDrogalis commented 5 years ago

I think we’re on the verge of overusing our running example of movies. If I were a reader, I’d start to get fatigued after seeing it a couple of times. As long as the example scenario is simple and relatable, I don’t think we gain a lot by holding that constant.


gAmUssA commented 5 years ago

Personally, I'm with @MichaelDrogalis on that. We shouldn't limit ourselves to one or two domain models. Forcing an existing domain model to fit, just to avoid introducing a new one, creates unnatural use cases. Also, frankly, I was struggling to fit a stream-to-stream (S2S) join into the movie/rating model 😃

riferrei commented 5 years ago

Well... I respectfully disagree with that approach -- but in favor of the collective effort, I won't be against it. Gonna build a new example for Join 2 Streams KSQL then.