Open eelyaj opened 2 years ago
There are a lot of write I/O events in the 'KafkaProducer' thread. I don't know whether that is normal.
Thread 16 "KafkaProducer" hit Breakpoint 1, 0x00007ffff6553fb0 in write () from /usr/lib64/libpthread.so.0 (gdb) bt
I think this might be a dup of https://github.com/edenhill/librdkafka/issues/3538
I'm working on an improved wakeup mechanism for 1.9.
Thanks. Is there any workaround I can use to avoid this issue, such as a config setting or parameter? Or maybe I can roll back the librdkafka version in my system; if so, which version should I use?
Tested with 100 partitions, 3 brokers, 4 vCPUs, 16 GB memory, and 60-byte packets.
version | throughput(packets/second) |
---|---|
1.8.2 | 800,000 |
1.7.0 | 800,000 |
1.6.1 | 1,000,000 |
1.5.3 | 790,000 |
1.4.4 | 790,000 |
1.3.0 | 760,000 |
1.2.2 | 790,000 |
1.1.0 | 790,000 |
0.11.6 | 1,020,000 |
@eelyaj Is this still an issue?
Closing as the fix is merged already. Feel free to reopen if you still see the issue.
Description
I have a 100-partition topic in my system. I found that with 3 Kafka brokers, librdkafka can send 1,500,000 packets per second to Kafka. But when I increase the number of brokers from 3 to 20, librdkafka can only send 680,000 packets per second.
I use the rd_kafka_produce_batch API in my producer, with parameters partition=RD_KAFKA_PARTITION_UA, msgflags=RD_KAFKA_MSG_F_COPY, message_cnt=1000.
The 'top' CPU output looks like this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7435 root 21 1 1880768 1.2g 7744 R 59.7 3.7 9:58.51 KafkaProducer
7453 root 22 2 1880768 1.2g 7744 R 33.0 3.7 5:17.44 Serializer2
7452 root 22 2 1880768 1.2g 7744 R 32.7 3.7 5:16.51 Serializer1
7451 root 22 2 1880768 1.2g 7744 R 31.7 3.7 5:12.27 Serializer0
7018 root 20 0 1880768 1.2g 7744 S 13.2 3.7 1:56.81 rdk:broker15000
7021 root 20 0 1880768 1.2g 7744 S 10.2 3.7 1:39.75 rdk:broker15000
7027 root 20 0 1880768 1.2g 7744 S 9.2 3.7 1:36.97 rdk:broker15000
7454 root 23 3 1880768 1.2g 7744 R 8.6 3.7 1:20.87 UdpDispatch
7025 root 20 0 1880768 1.2g 7744 S 8.3 3.7 1:29.82 rdk:broker15000
7033 root 20 0 1880768 1.2g 7744 S 8.3 3.7 1:36.80 rdk:broker15000
7019 root 20 0 1880768 1.2g 7744 S 7.9 3.7 1:07.05 rdk:broker15000
7030 root 20 0 1880768 1.2g 7744 S 7.9 3.7 1:18.41 rdk:broker15000
7020 root 20 0 1880768 1.2g 7744 R 7.6 3.7 1:13.25 rdk:broker15000
7036 root 20 0 1880768 1.2g 7744 R 7.6 3.7 0:55.21 rdk:broker15000
7024 root 20 0 1880768 1.2g 7744 S 6.9 3.7 0:55.41 rdk:broker15000
7035 root 20 0 1880768 1.2g 7744 S 6.9 3.7 1:02.24 rdk:broker15000
7028 root 20 0 1880768 1.2g 7744 S 6.3 3.7 0:55.50 rdk:broker15000
7034 root 20 0 1880768 1.2g 7744 S 5.6 3.7 0:54.06 rdk:broker15000
7026 root 20 0 1880768 1.2g 7744 S 5.3 3.7 0:42.09 rdk:broker15000
The 'KafkaProducer' thread is where I call the rd_kafka_produce_batch API to send messages to Kafka. I also used 'perf top' to show the CPU profile of 'KafkaProducer':
Samples: 17K of event 'cycles', Event count (approx.): 2675618158 lost: 0/0
Children Self Shared Object Symbol
The 'write' system calls are near the top of the list.
How to reproduce
Deploy a large number of Kafka brokers. It reproduces every time in my system. I've tried versions 1.1.0 and 1.7.0.
Checklist
Please provide the following information:
librdkafka version: 1.7.0
Apache Kafka version: 2.2.1
"queue.buffering.max.ms": "1000", "queue.buffering.max.messages": "1000000", "queue.buffering.max.kbytes": "2048000", "batch.num.messages": "10000", "compression.codec": "zstd", "socket.send.buffer.bytes": "3200000", "socket.receive.buffer.bytes": "3200000", "message.max.bytes": "209715200", "request.required.acks": "1", "request.timeout.ms": "30000", "partitioner": "murmur2_random", "replication_factor": 2
Operating system: EulerOS 2.7 (the same as CentOS)