uperl / Kafka-Librd

perl bindings to librdkafka

Possible thread race condition #7

Open weberbiz opened 7 years ago

weberbiz commented 7 years ago

Hello, We've been trying to get Kafka::Librd version 0.08 to work through Alien-Librdkafka-v0.9.5 against a 6-node Kafka cluster. An apparent race condition in the librdkafka library emerges when a broker list with more than one entry is used. We happen to be using perl 5.16.3, but verified the problem in other versions as well, in both 32-bit and 64-bit builds. Sometimes perl dies with a segmentation violation, or it may produce a "perl: double free or corruption" with a stack dump (see below). The race condition appears to occur between the Kafka::Librd->new() and $kafka_librd->brokers_add() calls. Putting a sleep(10) between the calls, for example, seems to mitigate the problem somewhat. Increasing the number of brokers increases the probability of a crash.
The reason we're contacting you instead of Alien::Librdkafka is that we could not reproduce the problem with their examples/rdkafka_example program, which is basically doing the same things as your module. We could not capture the problem under the gdb debugger, presumably because the debugger alters how the threads are scheduled.
We were wondering if you're already aware of the issue and are addressing it. If not, we may try to rework Rdkafka.xs to see if we can make it match the behavior of the rdkafka_example C program, which works flawlessly.

Thanks for your efforts! D Weber Ticketmaster.com

Stack trace from perl:
*** glibc detected *** perl: double free or corruption (!prev): 0x0000000003048ed0 ***
*** glibc detected *** perl: corrupted double-linked list: 0x0000000003048ec0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2ba50d8964af]
/lib64/libc.so.6(cfree+0x4b)[0x2ba50d89a7ab]
/lib64/libc.so.6(fclose+0x14b)[0x2ba50d884d5b]
/lib64/libnss_files.so.2(_nss_files_gethostbyname2_r+0x190)[0x2ba519721bd0]
/lib64/libc.so.6[0x2ba50d8e176b]
/lib64/libc.so.6(getaddrinfo+0x21a)[0x2ba50d8e39ba]
/app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1[0x2ba512226ef4]
/app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1[0x2ba5121f6204]
/app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1[0x2ba512227ed0]
/lib64/libpthread.so.0[0x2ba51269d83d]
/lib64/libc.so.6[0x2ba50d8e1bbc]
/lib64/libc.so.6(getaddrinfo+0x21a)[0x2ba50d8e39ba]
/lib64/libc.so.6(clone+0x6d)[0x2ba50d8fa18d]
======= Memory map: ========
00400000-00402000 r-xp 00000000 00:15 35346834 /software/x64/perl/5.16.3/bin/perl 00601000-00602000 rw-p 00001000 00:15 35346834 /software/x64/perl/5.16.3/bin/perl 0296e000-03054000 rw-p 0296e000 00:00 0 2ba50c7b4000-2ba50c7d0000 r-xp 00000000 fd:00 713155 /lib64/ld-2.5.so 2ba50c7d0000-2ba50c7d2000 rw-p 2ba50c7d0000 00:00 0 2ba50c9d0000-2ba50c9d1000 r--p 0001c000 fd:00 713155 /lib64/ld-2.5.so 2ba50c9d1000-2ba50c9d2000 rw-p 0001d000 fd:00 713155 /lib64/ld-2.5.so 2ba50c9d2000-2ba50cb33000 r-xp 00000000 00:15 103262105 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/CORE/libperl.so 2ba50cb33000-2ba50cd33000 ---p 00161000 00:15 103262105 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/CORE/libperl.so 2ba50cd33000-2ba50cd3e000 rw-p 00161000 00:15 103262105 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/CORE/libperl.so 2ba50cd49000-2ba50cd5e000 r-xp 00000000 fd:00 713172 /lib64/libnsl-2.5.so 2ba50cd5e000-2ba50cf5d000 ---p 00015000 fd:00 713172 /lib64/libnsl-2.5.so
2ba50cf5d000-2ba50cf5e000 r--p 00014000 fd:00 713172 /lib64/libnsl-2.5.so 2ba50cf5e000-2ba50cf5f000 rw-p 00015000 fd:00 713172 /lib64/libnsl-2.5.so 2ba50cf5f000-2ba50cf62000 rw-p 2ba50cf5f000 00:00 0 2ba50cf62000-2ba50cf64000 r-xp 00000000 fd:00 713168 /lib64/libdl-2.5.so 2ba50cf64000-2ba50d164000 ---p 00002000 fd:00 713168 /lib64/libdl-2.5.so 2ba50d164000-2ba50d165000 r--p 00002000 fd:00 713168 /lib64/libdl-2.5.so 2ba50d165000-2ba50d166000 rw-p 00003000 fd:00 713168 /lib64/libdl-2.5.so 2ba50d166000-2ba50d1e8000 r-xp 00000000 fd:00 713170 /lib64/libm-2.5.so 2ba50d1e8000-2ba50d3e7000 ---p 00082000 fd:00 713170 /lib64/libm-2.5.so 2ba50d3e7000-2ba50d3e8000 r--p 00081000 fd:00 713170 /lib64/libm-2.5.so 2ba50d3e8000-2ba50d3e9000 rw-p 00082000 fd:00 713170 /lib64/libm-2.5.so 2ba50d3e9000-2ba50d3f2000 r-xp 00000000 fd:00 713166 /lib64/libcrypt-2.5.so 2ba50d3f2000-2ba50d5f1000 ---p 00009000 fd:00 713166 /lib64/libcrypt-2.5.so 2ba50d5f1000-2ba50d5f2000 r--p 00008000 fd:00 713166 /lib64/libcrypt-2.5.so 2ba50d5f2000-2ba50d5f3000 rw-p 00009000 fd:00 713166 /lib64/libcrypt-2.5.so 2ba50d5f3000-2ba50d622000 rw-p 2ba50d5f3000 00:00 0 2ba50d622000-2ba50d624000 r-xp 00000000 fd:00 713194 /lib64/libutil-2.5.so 2ba50d624000-2ba50d823000 ---p 00002000 fd:00 713194 /lib64/libutil-2.5.so 2ba50d823000-2ba50d824000 r--p 00001000 fd:00 713194 /lib64/libutil-2.5.so 2ba50d824000-2ba50d825000 rw-p 00002000 fd:00 713194 /lib64/libutil-2.5.so 2ba50d825000-2ba50d974000 r-xp 00000000 fd:00 713162 /lib64/libc-2.5.so 2ba50d974000-2ba50db74000 ---p 0014f000 fd:00 713162 /lib64/libc-2.5.so 2ba50db74000-2ba50db78000 r--p 0014f000 fd:00 713162 /lib64/libc-2.5.so 2ba50db78000-2ba50db79000 rw-p 00153000 fd:00 713162 /lib64/libc-2.5.so 2ba50db79000-2ba50db80000 rw-p 2ba50db79000 00:00 0 2ba50db80000-2ba51114f000 r--p 00000000 fd:00 1337999 /usr/lib/locale/locale-archive 2ba51114f000-2ba511153000 r-xp 00000000 00:15 19647616 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/IO/IO.so 
2ba511153000-2ba511352000 ---p 00004000 00:15 19647616 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/IO/IO.so 2ba511352000-2ba511353000 rw-p 00003000 00:15 19647616 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/IO/IO.so 2ba511353000-2ba51135a000 r-xp 00000000 00:15 103619381 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Socket/Socket.so 2ba51135a000-2ba511559000 ---p 00007000 00:15 103619381 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Socket/Socket.so 2ba511559000-2ba51155b000 rw-p 00006000 00:15 103619381 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Socket/Socket.so 2ba51155b000-2ba511568000 r-xp 00000000 00:15 133225988 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/JSON/XS/XS.so 2ba511568000-2ba511767000 ---p 0000d000 00:15 133225988 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/JSON/XS/XS.so 2ba511767000-2ba511768000 rw-p 0000c000 00:15 133225988 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/JSON/XS/XS.so 2ba511768000-2ba51176a000 r-xp 00000000 00:15 16627297 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/attributes/attributes.so 2ba51176a000-2ba511969000 ---p 00002000 00:15 16627297 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/attributes/attributes.so 2ba511969000-2ba51196a000 rw-p 00001000 00:15 16627297 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/attributes/attributes.so 2ba51196a000-2ba5119b9000 r-xp 00000000 00:15 19550974 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/re/re.so 2ba5119b9000-2ba511bb9000 ---p 0004f000 00:15 19550974 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/re/re.so 2ba511bb9000-2ba511bba000 rw-p 0004f000 00:15 19550974 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/re/re.so 2ba511bba000-2ba511bc0000 r-xp 00000000 00:15 38262731 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/List/Util/Util.so 2ba511bc0000-2ba511dbf000 ---p 00006000 00:15 38262731 
/software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/List/Util/Util.so 2ba511dbf000-2ba511dc0000 rw-p 00005000 00:15 38262731 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/List/Util/Util.so 2ba511dc0000-2ba511dc8000 r-xp 00000000 00:15 24110830 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Encode/Encode.so 2ba511dc8000-2ba511fc7000 ---p 00008000 00:15 24110830 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Encode/Encode.so 2ba511fc7000-2ba511fc8000 rw-p 00007000 00:15 24110830 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Encode/Encode.so 2ba511fc8000-2ba511fd0000 r-xp 00000000 00:15 8079130 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/Kafka/Librd/Librd.so 2ba511fd0000-2ba5121cf000 ---p 00008000 00:15 8079130 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/Kafka/Librd/Librd.so 2ba5121cf000-2ba5121d0000 rw-p 00007000 00:15 8079130 /app/shared/lib/cpan64/lib/perl5/x86_64-linux/auto/Kafka/Librd/Librd.so 2ba5121d0000-2ba512260000 r-xp 00000000 00:15 3604540 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1 2ba512260000-2ba512460000 ---p 00090000 00:15 3604540 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1 2ba512460000-2ba51246d000 rw-p 00090000 00:15 3604540 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka.so.1 2ba51246d000-2ba512489000 r-xp 00000000 00:15 3604536 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka++.so.1 2ba512489000-2ba512688000 ---p 0001c000 00:15 3604536 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka++.so.1 2ba512688000-2ba51268c000 rw-p 0001b000 00:15 3604536 /app/shared/lib/cpan64/lib/perl5/auto/share/dist/Alien-Librdkafka/lib/librdkafka++.so.1 2ba512697000-2ba5126ad000 r-xp 00000000 fd:00 713186 /lib64/libpthread-2.5.so 2ba5126ad000-2ba5128ad000 ---p 00016000 fd:00 713186 /lib64/libpthread-2.5.so 2ba5128ad000-2ba5128ae000 r--p 
00016000 fd:00 713186 /lib64/libpthread-2.5.so 2ba5128ae000-2ba5128af000 rw-p 00017000 fd:00 713186 /lib64/libpthread-2.5.so 2ba5128af000-2ba5128b3000 rw-p 2ba5128af000 00:00 0 2ba5128b3000-2ba5128c7000 r-xp 00000000 fd:00 713278 /lib64/libz.so.1.2.3 2ba5128c7000-2ba512ac6000 ---p 00014000 fd:00 713278 /lib64/libz.so.1.2.3 2ba512ac6000-2ba512ac7000 rw-p 00013000 fd:00 713278 /lib64/libz.so.1.2.3 2ba512ac7000-2ba512bf4000 r-xp 00000000 fd:00 713277 /lib64/libcrypto.so.0.9.8e 2ba512bf4000-2ba512df3000 ---p 0012d000 fd:00 713277 /lib64/libcrypto.so.0.9.8e 2ba512df3000-2ba512e14000 rw-p 0012c000 fd:00 713277 /lib64/libcrypto.so.0.9.8e 2ba512e14000-2ba512e18000 rw-p 2ba512e14000 00:00 0 2ba512e18000-2ba512e60000 r-xp 00000000 fd:00 713279 /lib64/libssl.so.0.9.8e 2ba512e60000-2ba513060000 ---p 00048000 fd:00 713279 /lib64/libssl.so.0.9.8e 2ba513060000-2ba513066000 rw-p 00048000 fd:00 713279 /lib64/libssl.so.0.9.8e 2ba513066000-2ba51306d000 r-xp 00000000 fd:00 713190 /lib64/librt-2.5.so 2ba51306d000-2ba51326d000 ---p 00007000 fd:00 713190 /lib64/librt-2.5.so 2ba51326d000-2ba51326e000 r--p 00007000 fd:00 713190 /lib64/librt-2.5.so 2ba51326e000-2ba51326f000 rw-p 00008000 fd:00 713190 /lib64/librt-2.5.so 2ba51326f000-2ba513355000 r-xp 00000000 fd:00 1333707 /usr/lib64/libstdc++.so.6.0.8 2ba513355000-2ba513554000 ---p 000e6000 fd:00 1333707 /usr/lib64/libstdc++.so.6.0.8 2ba513554000-2ba51355a000 r--p 000e5000 fd:00 1333707 /usr/lib64/libstdc++.so.6.0.8 2ba51355a000-2ba51355d000 rw-p 000eb000 fd:00 1333707 /usr/lib64/libstdc++.so.6.0.8 2ba51355d000-2ba51356f000 rw-p 2ba51355d000 00:00 0 2ba51356f000-2ba51357c000 r-xp 00000000 fd:00 713154 /lib64/libgcc_s-4.1.2-20080825.so.1 2ba51357c000-2ba51377c000 ---p 0000d000 fd:00 713154 /lib64/libgcc_s-4.1.2-20080825.so.1 2ba51377c000-2ba51377d000 rw-p 0000d000 fd:00 713154 /lib64/libgcc_s-4.1.2-20080825.so.1 2ba51377d000-2ba5137a9000 r-xp 00000000 fd:00 1334565 /usr/lib64/libgssapi_krb5.so.2.2 2ba5137a9000-2ba5139a9000 ---p 0002c000 
fd:00 1334565 /usr/lib64/libgssapi_krb5.so.2.2 2ba5139a9000-2ba5139ab000 rw-p 0002c000 fd:00 1334565 /usr/lib64/libgssapi_krb5.so.2.2 2ba5139ab000-2ba513a3c000 r-xp 00000000 fd:00 1333319 /usr/lib64/libkrb5.so.3.3 2ba513a3c000-2ba513c3c000 ---p 00091000 fd:00 1333319 /usr/lib64/libkrb5.so.3.3 2ba513c3c000-2ba513c40000 rw-p 00091000 fd:00 1333319 /usr/lib64/libkrb5.so.3.3 2ba513c40000-2ba513c42000 r-xp 00000000 fd:00 713223 /lib64/libcom_err.so.2.1 2ba513c42000-2ba513e41000 ---p 00002000 fd:00 713223 /lib64/libcom_err.so.2.1 2ba513e41000-2ba513e42000 rw-p 00001000 fd:00 713223 /lib64/libcom_err.so.2.1 2ba513e42000-2ba513e66000 r-xp 00000000 fd:00 1334799 /usr/lib64/libk5crypto.so.3.1 2ba513e66000-2ba514065000 ---p 00024000 fd:00 1334799 /usr/lib64/libk5crypto.so.3.1 2ba514065000-2ba514067000 rw-p 00023000 fd:00 1334799 /usr/lib64/libk5crypto.so.3.1 2ba514067000-2ba51406f000 r-xp 00000000 fd:00 1335417 /usr/lib64/libkrb5support.so.0.1 2ba51406f000-2ba51426e000 ---p 00008000 fd:00 1335417 /usr/lib64/libkrb5support.so.0.1 2ba51426e000-2ba51426f000 rw-p 00007000 fd:00 1335417 /usr/lib64/libkrb5support.so.0.1 2ba51426f000-2ba514271000 r-xp 00000000 fd:00 713373 /lib64/libkeyutils-1.2.so 2ba514271000-2ba514470000 ---p 00002000 fd:00 713373 /lib64/libkeyutils-1.2.so 2ba514470000-2ba514471000 rw-p 00001000 fd:00 713373 /lib64/libkeyutils-1.2.so 2ba514471000-2ba514482000 r-xp 00000000 fd:00 713188 /lib64/libresolv-2.5.so 2ba514482000-2ba514682000 ---p 00011000 fd:00 713188 /lib64/libresolv-2.5.so 2ba514682000-2ba514683000 r--p 00011000 fd:00 713188 /lib64/libresolv-2.5.so 2ba514683000-2ba514684000 rw-p 00012000 fd:00 713188 /lib64/libresolv-2.5.so 2ba514684000-2ba514686000 rw-p 2ba514684000 00:00 0 2ba514686000-2ba51469b000 r-xp 00000000 fd:00 713271 /lib64/libselinux.so.1 2ba51469b000-2ba51489b000 ---p 00015000 fd:00 713271 /lib64/libselinux.so.1 2ba51489b000-2ba51489d000 rw-p 00015000 fd:00 713271 /lib64/libselinux.so.1 2ba51489d000-2ba51489e000 rw-p 2ba51489d000 00:00 0 
2ba51489e000-2ba5148d9000 r-xp 00000000 fd:00 713215 /lib64/libsepol.so.1 2ba5148d9000-2ba514ad9000 ---p 0003b000 fd:00 713215 /lib64/libsepol.so.1 2ba514ad9000-2ba514ada000 rw-p 0003b000 fd:00 713215 /lib64/libsepol.so.1 2ba514ada000-2ba514ae4000 rw-p 2ba514ada000 00:00 0 2ba514ae4000-2ba514ae7000 r-xp 00000000 00:15 47521075 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Fcntl/Fcntl.so 2ba514ae7000-2ba514ce7000 ---p 00003000 00:15 47521075 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Fcntl/Fcntl.so 2ba514ce7000-2ba514ce8000 rw-p 00003000 00:15 47521075 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Fcntl/Fcntl.so 2ba514ce8000-2ba514cf9000 r-xp 00000000 00:15 133980523 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/POSIX/POSIX.so 2ba514cf9000-2ba514ef9000 ---p 00011000 00:15 133980523 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/POSIX/POSIX.so 2ba514ef9000-2ba514efc000 rw-p 00011000 00:15 133980523 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/POSIX/POSIX.so 2ba514efc000-2ba514f03000 r-xp 00000000 00:15 17896113 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Data/Dumper/Dumper.so 2ba514f03000-2ba515103000 ---p 00007000 00:15 17896113 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Data/Dumper/Dumper.so 2ba515103000-2ba515104000 rw-p 00007000 00:15 17896113 /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/auto/Data/Dumper/Dumper.so 2ba515104000-2ba515105000 ---p 2ba515104000 00:00 0 2ba515105000-2ba515b05000 rw-p 2ba515105000 00:00 0 2ba515b05000-2ba515b06000 ---p 2ba515b05000 00:00 0 2ba515b06000-2ba516506000 rw-p 2ba515b06000 00:00 0 2ba516506000-2ba51650d000 r--s 00000000 fd:00 1394145 /usr/lib64/gconv/gconv-modules.cache 2ba51650d000-2ba51650e000 rw-p 2ba51650d000 00:00 0 2ba51650e000-2ba51650f000 ---p 2ba51650e000 00:00 0 2ba51650f000-2ba516f0f000 rw-p 2ba51650f000 00:00 0 2ba516f0f000-2ba516f10000 ---p 2ba516f0f000 00:00 0 2ba516f10000-2ba517910000 rw-p 2ba516f10000 00:00 0 
2ba517910000-2ba517911000 ---p 2ba517910000 00:00 0 2ba517911000-2ba518311000 rw-p 2ba517911000 00:00 0 2ba518311000-2ba518312000 ---p 2ba518311000 00:00 0 2ba518312000-2ba518d12000 rw-p 2ba518312000 00:00 0 2ba518d12000-2ba518d13000 ---p 2ba518d12000 00:00 0 2ba518d13000-2ba519714000 rw-p 2ba518d13000 00:00 0 2ba51971e000-2ba519728000 r-xp 00000000 fd:00 713178 /lib64/libnss_files-2.5.so 2ba519728000-2ba519927000 ---p 0000a000 fd:00 713178 /lib64/libnss_files-2.5.so 2ba519927000-2ba519928000 r--p 00009000 fd:00 713178 /lib64/libnss_files-2.5.so 2ba519928000-2ba519929000 rw-p 0000a000 fd:00 713178 /lib64/libnss_files-2.5.so 2ba519934000-2ba519938000 r-xp 00000000 fd:00 713176 /lib64/libnss_dns-2.5.so 2ba519938000-2ba519b37000 ---p 00004000 fd:00 713176 /lib64/libnss_dns-2.5.so 2ba519b37000-2ba519b38000 r--p 00003000 fd:00 713176 /lib64/libnss_dns-2.5.so 2ba519b38000-2ba519b39000 rw-p 00004000 fd:00 713176 /lib64/libnss_dns-2.5.so 2ba51c000000-2ba51c021000 rw-p 2ba51c000000 00:00 0 2ba51c021000-2ba520000000 ---p 2ba51c021000 00:00 0 7fffd9756000-7fffd976b000 rw-p 7ffffffe9000 00:00 0 [stack] 7fffd97fd000-7fffd9800000 r-xp 7fffd97fd000 00:00 0 [vdso] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vsyscall]

trinitum commented 7 years ago

Hi, I never had this issue. We are using the module with perl 5.24 on Linux on a 3-node cluster and it works without problems. Can you provide me with a runnable script that exposes the problem for you? I would also like to see the perl -V output. I'll try to look into it, but unfortunately I don't have that much free time, so I can't promise quick results. If you can investigate the problem yourself and create a merge request, that would be awesome.

weberbiz commented 7 years ago

Hi Pavel, I’m attaching a small script that (eventually) generates a crash. I list the 6 brokers I’m testing with, so you’ll need to swap in your own. I made it crash with 5 brokers, but not with fewer. I made sure all nodes were accessible on port 6667 (this is how they were configured).
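The reachability check mentioned above can be done from Perl itself before any repro run. This is a minimal sketch using only the core IO::Socket::INET module; the `check_brokers` helper and the 3-second timeout are illustrative assumptions, not something taken from the attached script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: try a plain TCP connect to each "host:port" entry.
# Returns a hash mapping each entry to 1 (connected) or 0 (failed).
sub check_brokers {
    my ($timeout, @brokers) = @_;
    my %reachable;
    for my $broker (@brokers) {
        my ($host, $port) = split /:/, $broker, 2;
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => 'tcp',
            Timeout  => $timeout,
        );
        $reachable{$broker} = $sock ? 1 : 0;
        close $sock if $sock;
    }
    return %reachable;
}

# Brokers are taken from the command line, e.g. host1:6667 host2:6667
my %status = check_brokers(3, @ARGV);
printf "%-50s %s\n", $_, $status{$_} ? 'ok' : 'UNREACHABLE'
    for sort keys %status;
```

A successful connect only shows that the port accepts TCP connections from this box; it says nothing about whether the broker speaks the Kafka protocol correctly.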

Here’s the perl -V:

Summary of my perl5 (revision 5 version 16 subversion 3) configuration:

Platform:
  osname=linux, osvers=2.6.18-238.9.1.el5xen, archname=x86_64-linux
  uname='linux bld1.sys.tools1.websys.tmcs 2.6.18-238.9.1.el5xen #1 smp tue apr 12 18:53:56 edt 2011 x86_64 x86_64 x86_64 gnulinux '
  config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Dversion=5.16.3 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/software/x64/perl/5.16.3 -Dprefix=/software/x64/perl/5.16.3 -Darchname=x86_64-linux -Dvendorprefix=/software/x64/perl/5.16.3 -Dsiteprefix=/software/x64/perl/5.16.3 -Duseshrplib -Uusethreads -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/software/x64/perl/5.16.3/bin'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
Compiler:
  cc='gcc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
  cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.1.2 20080704 (Red Hat 4.1.2-54)', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
Linker and Libraries:
  ld='gcc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
  libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
  libc=, so=so, useshrplib=true, libperl=libperl.so
  gnulibc_version='2.5'
Dynamic Linking:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux/CORE'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
  Built under linux
  Compiled at Sep 24 2013 11:46:50
  %ENV:
    PERL5LIB="/app/shared/lib/cpan64/lib/perl5/x86_64-linux/:/app/shared/lib/cpan64/lib/perl5:/app/shared/lib/cpan64/lib/perl5/i386-linux"
    PERL_BASEDIR="/software/x64/perl/5.16.3"
    PERL_BINDIR="/software/x64/perl/5.16.3/bin"
    PERL_SITE_BINDIR="/software/x64/perl/5.16.3/lib/5.16.3"
    PERL_VERSION="5.16.3"
  @INC:
    /app/shared/lib/cpan64/lib/perl5/x86_64-linux/
    /app/shared/lib/cpan64/lib/perl5/x86_64-linux
    /app/shared/lib/cpan64/lib/perl5
    /app/shared/lib/cpan64/lib/perl5/i386-linux
    /software/x64/perl/5.16.3/lib/site_perl/5.16.3/x86_64-linux
    /software/x64/perl/5.16.3/lib/site_perl/5.16.3
    /software/x64/perl/5.16.3/lib/vendor_perl/5.16.3/x86_64-linux
    /software/x64/perl/5.16.3/lib/vendor_perl/5.16.3
    /software/x64/perl/5.16.3/lib/5.16.3/x86_64-linux
    /software/x64/perl/5.16.3/lib/5.16.3
    .

Sample output from demo_problem.pl:
...
27 new brokers_add
28 new brokers_add
*** glibc detected *** perl: double free or corruption (!prev): 0x0000000016b64a00 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2b024aeaa4af]
/lib64/libc.so.6(cfree+0x4b)[0x2b024aeae7ab]
/lib64/libc.so.6(fclose+0x14b)[0x2b024ae98d5b]
/lib64/libnss_files.so.2(_nss_files_gethostbyname2_r+0x190)[0x2b0254f32bd0]
...

It always dies after _new, while performing brokers_add(). Separate threads seem to be used for each.
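For readers without the attachment, the loop described here might look roughly like the sketch below. This is a hypothetical reconstruction and not the original demo_problem.pl: the broker names, the producer handle type, the empty config hash, the iteration count, and the `KAFKA_REPRO` / `KAFKA_REPRO_DELAY` environment variables are all assumptions. It relies only on the `new` and `brokers_add` calls named in this report, with the `RD_KAFKA_PRODUCER` constant as shown in the Kafka::Librd documentation.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

$| = 1;    # unbuffered, so the last marker printed before a crash is visible

# Hypothetical broker names: replace with your own cluster.
my @brokers = map { "kafka${_}.example.com:6667" } 1 .. 6;

# brokers_add() takes one comma-separated "host:port" string.
sub broker_string { return join ',', @_ }

# One iteration: create a handle, optionally pause, then add the brokers.
sub run_once {
    my ($iteration, $delay, @hosts) = @_;
    print "$iteration new ";
    # Assumption: a producer handle with an empty configuration.
    my $kafka = Kafka::Librd->new(Kafka::Librd::RD_KAFKA_PRODUCER(), {});
    sleep $delay if $delay;    # e.g. 10, the mitigation mentioned in the report
    print "brokers_add\n";
    $kafka->brokers_add(broker_string(@hosts));
    return;                    # $kafka goes out of scope here
}

# Guarded so that merely loading this file never touches a cluster.
if ($ENV{KAFKA_REPRO}) {
    require Kafka::Librd;
    run_once($_, $ENV{KAFKA_REPRO_DELAY} || 0, @brokers) for 1 .. 100;
}
```

Running it with `KAFKA_REPRO=1` exercises the back-to-back calls; adding `KAFKA_REPRO_DELAY=10` inserts the pause between them, which makes it easy to compare crash rates with and without the delay.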

I got this stack trace while running the script via gdb perl:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aaab476d940 (LWP 25471)]
0x00002aaaabb89960 in __uflow () from /lib64/libc.so.6
(gdb) where
#0  0x00002aaaabb89960 in __uflow () from /lib64/libc.so.6
#1  0x00002aaaabb7d6b4 in _IO_getline_info_internal () from /lib64/libc.so.6
#2  0x00002aaaabb860c9 in fgets_unlocked () from /lib64/libc.so.6
#3  0x00002aaab477c33f in internal_getent () from /lib64/libnss_files.so.2
#4  0x00002aaab477cb41 in _nss_files_gethostbyname2_r () from /lib64/libnss_files.so.2
#5  0x00002aaaabbd876b in gaih_inet () from /lib64/libc.so.6
#6  0x00002aaaabbda9ba in getaddrinfo () from /lib64/libc.so.6
#7  0x00002aaaaf6a4ef4 in rd_getaddrinfo (nodesvc=, defsvc=0x70a624 "6667", flags=32, family=,
    socktype=<value optimized out>, protocol=<value optimized out>, errstr=0x2aaab476d0d8) at rdaddr.c:168
#8  0x00002aaaaf674204 in rd_kafka_broker_resolve (arg=) at rdkafka_broker.c:649
#9  rd_kafka_broker_connect (arg=) at rdkafka_broker.c:1315
#10 rd_kafka_broker_thread_main (arg=) at rdkafka_broker.c:4712
#11 0x00002aaaaf6a5ed0 in _thrd_wrapper_function (aArg=) at tinycthread.c:624
#12 0x00002aaaafb1b83d in start_thread () from /lib64/libpthread.so.0
#13 0x00002aaaabbf118d in clone () from /lib64/libc.so.6

Sorry, but I don’t have a statically compiled perl to get the symbol table stuff.
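Both traces end inside getaddrinfo() doing a hosts-file lookup from a librdkafka broker thread. One experiment that might help narrow this down is to resolve the broker names once in the main Perl thread and hand numeric addresses to brokers_add(), so the initial connects should not need a name lookup. The sketch below is an assumption-laden diagnostic aid rather than a fix: `resolve_brokers` is a hypothetical helper, it is IPv4 only, and it keeps just the first address per host. It uses the core Socket module's getaddrinfo and getnameinfo.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Socket qw(getaddrinfo getnameinfo AF_INET SOCK_STREAM
              NI_NUMERICHOST NI_NUMERICSERV);

# Hypothetical helper: turn "host:port" entries into "ip:port" entries,
# resolving each name in the calling thread.
sub resolve_brokers {
    my @resolved;
    for my $broker (@_) {
        my ($host, $port) = split /:/, $broker, 2;
        my ($err, @addrs) = getaddrinfo($host, $port,
            { family => AF_INET, socktype => SOCK_STREAM });
        die "cannot resolve $host: $err\n" if $err;
        # Only the first returned address is kept.
        my ($nerr, $ip) = getnameinfo($addrs[0]{addr},
            NI_NUMERICHOST | NI_NUMERICSERV);
        die "getnameinfo failed for $host: $nerr\n" if $nerr;
        push @resolved, "$ip:$port";
    }
    return @resolved;
}

# Usage sketch ($kafka is assumed to be an existing Kafka::Librd handle):
#   $kafka->brokers_add(join ',', resolve_brokers(@brokers));
print "$_\n" for resolve_brokers(@ARGV);
```

If the crashes stop with numeric addresses, that would point at concurrent name resolution rather than at the XS glue. Note that the client may still resolve hostnames later, once brokers advertise their names in cluster metadata, so this only changes the start-up path.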

regards, D Weber

From: Pavel Shaydo Sent: Tuesday, June 06, 2017 12:56 PM To: trinitum/perl-Kafka-Librd Cc: weberbiz ; Author Subject: Re: [trinitum/perl-Kafka-Librd] Possible thread race condition (#7)


weberbiz commented 7 years ago

Hi Pavel, If this helps, here’s how I ran the rdkafka_example program that ships with librdkafka:

for i in $(seq 1 100);do ./rdkafka_example -t test -p 0 -b "kf0001.dev3.awse1c.datasciences.tmcs:6667,kf0002.dev3.awse1e.datasciences.tmcs:6667,kf0003.dev3.awse1a.datasciences.tmcs:6667,kf0004.dev3.awse1b.datasciences.tmcs:6667,kf0005.dev3.awse1c.datasciences.tmcs:6667,kf0006.dev3.awse1e.datasciences.tmcs:6667" -P<<EOF; done
EOF
% Type stuff and hit enter to send
....

I could not make it crash. It runs on the same box as the perl.
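To compare the two programs on equal terms, the same repeat-and-count approach can be applied to any command. The following is a minimal sketch in core Perl; `run_many` is a hypothetical helper and the fixed count of 100 runs simply mirrors the shell loop above. It tallies runs that exit cleanly, exit non-zero, or die from a signal (which is how both a segmentation fault and a glibc abort would show up).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: run a command repeatedly and tally how each run ended.
sub run_many {
    my ($runs, @cmd) = @_;
    my %tally = (ok => 0, failed => 0, signaled => 0);
    for my $i (1 .. $runs) {
        my $status = system(@cmd);    # wait status, same value as $?
        if ($status == -1) {
            $tally{failed}++;         # command could not be started
        }
        elsif ($status & 127) {
            $tally{signaled}++;       # terminated by a signal (SIGSEGV, SIGABRT, ...)
        }
        elsif ($status >> 8) {
            $tally{failed}++;         # non-zero exit code
        }
        else {
            $tally{ok}++;
        }
    }
    return \%tally;
}

# Example: perl run_many.pl perl demo_problem.pl
if (@ARGV) {
    my $tally = run_many(100, @ARGV);
    print "$_: $tally->{$_}\n" for sort keys %$tally;
}
```

Passing the command as a list of two or more words avoids going through a shell, so a crash is reported as a signal rather than as a shell exit code.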

regards, D Weber


trinitum commented 7 years ago

Hi, note that the attached script didn't make it into the issue tracker.