lovoo / goka

Goka is a compact yet powerful distributed stream processing library for Apache Kafka written in Go.
BSD 3-Clause "New" or "Revised" License

Thought experiment: Librdkafka-based lookup table producers #456

Open owenniles opened 3 months ago

owenniles commented 3 months ago

Suppose I have a librdkafka-based producer that writes to a compacted topic. I want to add this compacted topic to my processor as a lookup edge. Does Goka currently support this use case?

Why might Goka not currently support this use case? Within the processor, lookup tables are implemented as views. Based on my understanding of the code, when a view's Get method is called to look up the state for a particular key, the view hashes the key and converts the digest to the number of the partition in which it expects to find the state. The view then queries the partition table responsible for that partition. But Goka and librdkafka have subtly different ways of converting hashes to partition numbers, so the view may end up looking for the state in the wrong partition table. Herein lies the problem.

In other words, even given the same key and hashing algorithm, a librdkafka-based producer and a Goka view may come up with different partition numbers. The difference arises because Goka treats the key digest as a signed 32-bit integer (see the linked code below), while librdkafka treats it as an unsigned 32-bit integer.

https://github.com/lovoo/goka/blob/e6a3579e783ed7a6120f3b176bb397186bcde062/view.go#L307
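
To make the mismatch concrete, here is a minimal sketch of the two interpretations described above. It is not taken from Goka or librdkafka: `signedPartition` and `unsignedPartition` are hypothetical helpers, the digests are hard-coded rather than produced by a real hasher, and the partition count of 10 is an arbitrary assumption. It only illustrates how the same 32-bit digest can map to different partitions depending on whether it is read as signed or unsigned.

```go
package main

import "fmt"

// signedPartition is a hypothetical helper, not Goka's actual code.
// It reinterprets the digest as a signed 32-bit integer, takes the
// remainder, and drops the sign so the result is never negative.
func signedPartition(digest uint32, numPartitions int32) int32 {
	p := int32(digest) % numPartitions
	if p < 0 {
		p = -p
	}
	return p
}

// unsignedPartition is a hypothetical helper, not librdkafka's actual code.
// It keeps the digest unsigned and takes the remainder directly.
func unsignedPartition(digest uint32, numPartitions int32) int32 {
	return int32(digest % uint32(numPartitions))
}

func main() {
	const numPartitions = 10 // assumed partition count, for illustration only

	// 0x00000007 has its top bit clear, so both readings agree.
	// 0xFFFFFFFF has its top bit set: as int32 it is -1, as uint32 it is 4294967295.
	for _, digest := range []uint32{0x00000007, 0xFFFFFFFF} {
		fmt.Printf("digest=%#x signed=%d unsigned=%d\n",
			digest,
			signedPartition(digest, numPartitions),
			unsignedPartition(digest, numPartitions))
	}
}
```

In this sketch the two functions agree whenever the top bit of the digest is clear and can disagree when it is set, which would mean that only some keys end up being looked up in the wrong partition table.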