redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

rebalance issue when using c++ client librdkafka #4943

Open barcahead opened 2 years ago

barcahead commented 2 years ago

I have a standalone Redpanda deployment and created a topic with 16 partitions. I started two consumers using librdkafka, and they kept competing with each other for all the partitions. I also tested with kafkajs, and it works fine: each consumer got 6 partitions.

Redpanda version: v22.1.3 (rev d1a4e00)
librdkafka: installed by running `apt install librdkafka-dev`
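For reference, a consumer configuration sketch that would produce debug output like the lines below. This is an assumption reconstructed from the logs, not the reporter's actual config: the broker address and group name are taken from the log lines, and `debug=cgrp,metadata` is the librdkafka setting that enables the `CGRPMETADATA`/`METADATA` output.

```
# Hypothetical librdkafka consumer configuration (reconstructed from the logs)
bootstrap.servers=112.48.179.18:9922
group.id=d9999
debug=cgrp,metadata
# librdkafka default assignment strategy:
partition.assignment.strategy=range,roundrobin
```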

Some logs:

```
RebalanceCb: Local: Assign partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
RebalanceCb: Local: Revoke partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
2022-05-26 20:06:45.682: LOG-7-CGRPMETADATA: [thrd:main]: consumer join: metadata for subscription is up to date (2368ms old)
2022-05-26 20:06:45.832: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Request metadata for 1 topic(s): partition assignor
2022-05-26 20:06:45.886: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: ===== Received metadata (for 1 requested topics): partition assignor =====
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: ClusterId: , ControllerId: 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: 1 brokers, 1 topics
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: cluster:(null)(), white list add broker, host:112.48.179.18, port:9922
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Broker #0/1: 112.48.179.18:9922 NodeId 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Topic #0/1: test with 16 partitions
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 0 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 1 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 2 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 3 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 4 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 5 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 6 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 7 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 8 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 9 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 10 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 11 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 12 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 13 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 14 Leader 0
2022-05-26 20:06:45.887: LOG-7-METADATA: [thrd:main]: Topic test partition 15 Leader 0
RebalanceCb: Local: Assign partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
RebalanceCb: Local: Revoke partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
2022-05-26 20:06:46.635: LOG-7-CGRPMETADATA: [thrd:main]: consumer join: metadata for subscription is up to date (1927ms old)
2022-05-26 20:06:46.786: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Request metadata for 1 topic(s): partition assignor
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: ===== Received metadata (for 1 requested topics): partition assignor =====
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: ClusterId: , ControllerId: 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: 1 brokers, 1 topics
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: cluster:(null)(), white list add broker, host:112.48.179.18, port:9922
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Broker #0/1: 112.48.179.18:9922 NodeId 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: 112.48.179.18:9922/0: Topic #0/1: test with 16 partitions
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 0 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 1 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 2 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 3 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 4 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 5 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 6 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 7 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 8 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 9 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 10 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 11 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 12 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 13 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 14 Leader 0
2022-05-26 20:06:46.840: LOG-7-METADATA: [thrd:main]: Topic test partition 15 Leader 0
RebalanceCb: Local: Assign partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
^C2022-05-26 20:06:47.215: LOG-7-SUBSCRIPTION: [thrd:main]: Group "d9999": effective subscription list changed from 1 to 0 topic(s):
RebalanceCb: Local: Revoke partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
2022-05-26 20:06:47.216: LOG-7-DESTROY: [thrd:app]: Terminating instance
2022-05-26 20:06:47.216: LOG-7-DESTROY: [thrd:main]: Destroy internal
2022-05-26 20:06:47.216: LOG-7-DESTROY: [thrd:main]: Removing all topics
RebalanceCb: Local: Revoke partitions: test[0], test[1], test[2], test[3], test[4], test[5], test[6], test[7], test[8], test[9], test[10], test[11], test[12], test[13], test[14], test[15],
2022-05-26 20:06:47.304: LOG-7-CGRPMETADATA: [thrd:main]: consumer join: metadata for subscription is up to date (1417ms old)
```

JIRA Link: CORE-927

twmb commented 2 years ago

Is this reproducible, and does this happen on Kafka proper? Rebalancing is mostly driven by the clients themselves.

barcahead commented 2 years ago

> Is this reproducible, and does this happen on Kafka proper? Rebalancing is mostly driven by the clients themselves.

Yes, it is reproducible with Redpanda + librdkafka. I tested Redpanda + kafkajs and Kafka + librdkafka, and both work fine.

I used https://hub.docker.com/r/bitnami/kafka/ for the Kafka test.