lovoo / goka

Goka is a compact yet powerful distributed stream processing library for Apache Kafka written in Go.
BSD 3-Clause "New" or "Revised" License
2.35k stars 175 forks source link

[Production Usage] Question: How can I modify the state of the whole table triggered by an input message? #462

Open akshatraika-moment opened 1 week ago

akshatraika-moment commented 1 week ago

We are using Goka in production and have encountered a scenario where we need to clean up the state in a table topic when processing a specific message from our input topic. The use case involves two types of messages:

The challenge we’re facing is as follows:

What would be the recommended approach to handle this?

owenniles commented 1 week ago

I have this use case as well.

One difficulty I've encountered using p.VisitAll to perform cleanup upon receiving the kind of multi-key cleanup message @akshatraika-moment has described is that we don't want to block the process callback/partition processor until p.VisitAll returns, but we also don't want to commit the cleanup message's offset until p.VisitAll returns.

If we block the processor, we find ourselves in a deadlock. p.VisitAll does not return until all of the visit events have been processed, but the some of the visit events will never be processed because one of the partition processors is blocked on p.VisitAll.

But if we don't block the processor callback/partition processor, then we lose at least once semantics. If the processor shuts down or rebalances after the multi-key cleanup message's offset is committed but before we've finished cleaning up, some of the visit events are lost forever.

We've considered using ctx.DeferCommit to maintain at least once semantics, but are concerned we would build up a large queue of uncommitted offsets. The idea would be to wait for p.VisitAll to return before committing any offsets after receiving a multi-key cleanup message. The concern about building up a large queue of uncommitted offsets comes from the fact that we have high throughput and a lot of unique keys on our input topic.

We would greatly appreciate any advice! Thank you.