confluentinc / librdkafka

The Apache Kafka C/C++ library

Segfault when destroying producer #4330

Open vbmithr opened 1 year ago

vbmithr commented 1 year ago

Description

Program terminated with signal SIGABRT, Aborted.
#0  0x00007fe12277026c in ?? () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0x7fe1221e3440 (LWP 1678161))]
(gdb) bt
#0  0x00007fe12277026c in ?? () from /usr/lib/libc.so.6
#1  0x00007fe122720a08 in raise () from /usr/lib/libc.so.6
#2  0x00007fe122709538 in abort () from /usr/lib/libc.so.6
#3  0x00007fe12270a2db in ?? () from /usr/lib/libc.so.6
#4  0x00007fe122719227 in ?? () from /usr/lib/libc.so.6
#5  0x00007fe1227768a1 in ?? () from /usr/lib/libc.so.6
#6  0x00007fe1227711ff in ?? () from /usr/lib/libc.so.6
#7  0x00007fe122776f4d in mtx_lock () from /usr/lib/libc.so.6
#8  0x00007fe122a0e874 in rd_kafka_q_disable0 (do_lock=1, rkq=0x55d7ed6b9a80)
    at /home/vb/code/aur/librdkafka/trunk/src/librdkafka-2.1.1/src/rdkafka_queue.h:193
#9  rd_kafka_q_destroy0 (disable=1, rkq=0x55d7ed6b9a80)
    at /home/vb/code/aur/librdkafka/trunk/src/librdkafka-2.1.1/src/rdkafka_queue.h:221
#10 rd_kafka_q_destroy_owner (rkq=0x55d7ed6b9a80)
    at /home/vb/code/aur/librdkafka/trunk/src/librdkafka-2.1.1/src/rdkafka_queue.h:247
#11 rd_kafka_destroy_final (rk=0x55d7ed6e2b70) at rdkafka.c:954
#12 0x00007fe122a100bb in rd_kafka_destroy_app (rk=0x55d7ed6e2b70, flags=<optimized out>, flags@entry=0) at rdkafka.c:1115
#13 0x00007fe122a1047b in rd_kafka_destroy (rk=<optimized out>) at rdkafka.c:1122

mensfeld commented 1 year ago

Do you use sticky cooperative rebalance strategy?

vbmithr commented 1 year ago

No. But I've found why it happens: I call rd_kafka_queue_io_event_enable and then close the fd I passed to it before calling rd_kafka_destroy.

So I can fix this for my use case but I guess it should not segfault anyway.

emasab commented 1 year ago

This seems to be caused by rk->rk_rep being already destroyed (rdkafka.c:954). Could you provide some example code for reproducing this?

nhaq-confluent commented 9 months ago

@vbmithr Wanted to ping you on the last request by @emasab. If this issue has been resolved or you have found a fix, we can close it.