confluentinc / librdkafka

The Apache Kafka C/C++ library

Key given in producer through librdkafka mismatches to consume on right nodejs consumer #3985

Closed simhan95 closed 2 years ago

simhan95 commented 2 years ago

Description

A C++ producer using librdkafka, producing messages with a given topic and key, writes them to a different partition than a Node.js producer (kafkajs library) producing with the same topic and key.

As a result, consumers written in Node.js (kafkajs library) receive the messages produced by the C++ librdkafka producer on a different partition from the messages produced by the Node.js (kafkajs) producer.

How to reproduce

  1. We build the latest librdkafka from the git repo on Linux and use it to produce messages on a particular topic.
  2. This C++ producer produces to a topic called "voice-test" with a key called "testkey" (for example).
  3. Another producer, written in Node.js with the kafkajs library, sends messages to Kafka on the same topic "voice-test" with the same key "testkey".
  4. We start two consumers in Node.js (kafkajs library) on the topic "voice-test". The messages from the C++ producer are consumed by consumer 1, and the messages from the Node.js producer are consumed by consumer 2.
  5. We expect a single consumer to receive all messages on "voice-test" with the key "testkey", but the C++ client sends its messages to a different partition than the Node.js producer does.

How we resolved this:

  1. Either produce from Node.js with the node-rdkafka library, which assigns keys to the same partitions as the C++ librdkafka producer.
  2. Or, as we did earlier, have the Node.js producer wrap the key in encoded double quotes, '\"testkey\"', which made it land on the same partition as the C++ producer.

edenhill commented 2 years ago

You need to have compatible partitioners in all producers producing to the same topic.

librdkafka has, for historical reasons, a non-java-compatible default partitioner, but you can easily change to the java-compatible murmur2_random partitioner by configuring partitioner=murmur2_random. There's hopefully something similar for kafkajs.
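To illustrate why the partitioner matters, here is a minimal, self-contained sketch of the key-to-partition mapping used by the Java client's default partitioner for keyed messages: a murmur2 hash of the key bytes, masked to a non-negative value, modulo the partition count. It is written from the Java client's published algorithm, not taken from librdkafka's source, and the function names `murmur2` and `java_style_partition` are hypothetical. librdkafka's default `consistent_random` partitioner hashes keys with CRC32 instead, so the same key generally maps to a different partition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Sketch of the murmur2 variant used by the Java client's default partitioner.
// Hypothetical helper name; not part of the librdkafka API.
std::uint32_t murmur2(const std::string &key) {
    const std::uint32_t seed = 0x9747b28c;
    const std::uint32_t m = 0x5bd1e995;
    const int r = 24;

    const std::size_t len = key.size();
    const unsigned char *data =
        reinterpret_cast<const unsigned char *>(key.data());
    std::uint32_t h = seed ^ static_cast<std::uint32_t>(len);

    // Mix the key four bytes at a time, little-endian.
    std::size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        std::uint32_t k = static_cast<std::uint32_t>(data[i]) |
                          (static_cast<std::uint32_t>(data[i + 1]) << 8) |
                          (static_cast<std::uint32_t>(data[i + 2]) << 16) |
                          (static_cast<std::uint32_t>(data[i + 3]) << 24);
        k *= m;
        k ^= k >> r;
        k *= m;
        h *= m;
        h ^= k;
    }

    // Mix the remaining 1 to 3 bytes, if any.
    switch (len - i) {
        case 3:
            h ^= static_cast<std::uint32_t>(data[i + 2]) << 16;
            // fall through
        case 2:
            h ^= static_cast<std::uint32_t>(data[i + 1]) << 8;
            // fall through
        case 1:
            h ^= static_cast<std::uint32_t>(data[i]);
            h *= m;
            break;
        default:
            break;
    }

    // Final avalanche.
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}

// Map a key to a partition the way the Java default partitioner does for
// non-null keys: clear the sign bit, then take the remainder.
std::int32_t java_style_partition(const std::string &key,
                                  std::int32_t num_partitions) {
    assert(num_partitions > 0);
    const std::uint32_t positive = murmur2(key) & 0x7fffffffU;
    return static_cast<std::int32_t>(
        positive % static_cast<std::uint32_t>(num_partitions));
}
```

Calling `java_style_partition("testkey", n)` with the partition count of "voice-test" gives the partition a Java-compatible producer would be expected to pick for that key. Any producer that uses a different hash, or different key bytes such as a quoted key, can end up on another partition.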