shogo82148 / Redis-Fast

fast perl binding for Redis database
https://metacpan.org/release/Redis-Fast
Other
25 stars 21 forks source link

Bad signal handling during blocking operation #11

Closed bjakubski closed 10 years ago

bjakubski commented 10 years ago

We discovered that Redis::Fast in some circumstances behaves badly when receiving signal during blocking operation (e.g. brpop).

Following code works ok when it receives signal 15 during brpop:

perl -MRedis::Fast -E 'my $c = Redis::Fast->new(reconnect=>2, every => 100, server => "localhost:6379"); $c->select("db"); $c->brpop("a", 100); '

Just adding $SIG{TERM} = sub {} (actual content of this sub does not matter) makes it ignore first TERM signal received and reissue brpop command (it appears it trigger reconnect code) second TERM signal crashes this program. ("double free").

When there is no reconnect enabled problem does not appear, so one needs to have both reconnect and some TERM singnal handler installed to reproduce it.

Tested on 0.08 with perl 5.10.1 and 5.16 I compiled Redis::Fast with DEBUG enabled and reproduced this issue - see below

$ perl -MRedisast -E '$SIG{TERM}= sub {warn "TERM handler called"}; my $c =->new(reconnect=>2, every => 100, server => "localhost:6379"); $c->select("db"); $c->brpop("a", 100); '
[lib/Redis/Fast.xs:668:XS_Redis__Fast__new]: start
[lib/Redis/Fast.xs:673:XS_Redis__Fast__new]: return 0x1683d30
[lib/Redis/Fast.xs:299:Redis__Fast_connect]: start
[lib/Redis/Fast.xs:232:__build_sock]: start
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is 0.100000
[lib/Redis/Fast.xs:182:wait_for_event]: select returns 1
[lib/Redis/Fast.xs:197:wait_for_event]: ready to write
[lib/Redis/Fast.xs:201:wait_for_event]: finish
[lib/Redis/Fast.xs:277:__build_sock]: finsih
[lib/Redis/Fast.xs:329:Redis__Fast_connect]: finish
[lib/Redis/Fast.xs:352:Redis__Fast_reconnect]: start
[lib/Redis/Fast.xs:360:Redis__Fast_reconnect]: finish
[lib/Redis/Fast.xs:550:Redis__Fast_run_cmd]: start SELECT
[lib/Redis/Fast.xs:552:Redis__Fast_run_cmd]: pid check: previous pid is 20468, now 20468
[lib/Redis/Fast.xs:580:Redis__Fast_run_cmd]: send command in sync mode
[lib/Redis/Fast.xs:585:Redis__Fast_run_cmd]: waiting response
[lib/Redis/Fast.xs:283:_wait_all_responses]: start
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000
[lib/Redis/Fast.xs:182:wait_for_event]: select returns 1
[lib/Redis/Fast.xs:197:wait_for_event]: ready to write
[lib/Redis/Fast.xs:201:wait_for_event]: finish
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000
[lib/Redis/Fast.xs:182:wait_for_event]: select returns 1
[lib/Redis/Fast.xs:193:wait_for_event]: ready to read
[lib/Redis/Fast.xs:423:Redis__Fast_sync_reply_cb]: 0x168fe00
[lib/Redis/Fast.xs:439:Redis__Fast_sync_reply_cb]: finish
[lib/Redis/Fast.xs:201:wait_for_event]: finish
[lib/Redis/Fast.xs:291:_wait_all_responses]: finish
[lib/Redis/Fast.xs:590:Redis__Fast_run_cmd]: finish SELECT
[lib/Redis/Fast.xs:352:Redis__Fast_reconnect]: start
[lib/Redis/Fast.xs:360:Redis__Fast_reconnect]: finish
[lib/Redis/Fast.xs:550:Redis__Fast_run_cmd]: start BRPOP
[lib/Redis/Fast.xs:552:Redis__Fast_run_cmd]: pid check: previous pid is 20468, now 20468
[lib/Redis/Fast.xs:580:Redis__Fast_run_cmd]: send command in sync mode
[lib/Redis/Fast.xs:585:Redis__Fast_run_cmd]: waiting response
[lib/Redis/Fast.xs:283:_wait_all_responses]: start
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000
[lib/Redis/Fast.xs:182:wait_for_event]: select returns 1
[lib/Redis/Fast.xs:197:wait_for_event]: ready to write
[lib/Redis/Fast.xs:201:wait_for_event]: finish
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000

##### HERE IT WAITS FOR brpop to return something, but we send it signal 15 

[lib/Redis/Fast.xs:182:wait_for_event]: select returns -1
[lib/Redis/Fast.xs:189:wait_for_event]: exception!!
[lib/Redis/Fast.xs:287:_wait_all_responses]: error: 2
[lib/Redis/Fast.xs:352:Redis__Fast_reconnect]: start
[lib/Redis/Fast.xs:360:Redis__Fast_reconnect]: finish
[lib/Redis/Fast.xs:580:Redis__Fast_run_cmd]: send command in sync mode
[lib/Redis/Fast.xs:585:Redis__Fast_run_cmd]: waiting response
[lib/Redis/Fast.xs:283:_wait_all_responses]: start
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000
[lib/Redis/Fast.xs:182:wait_for_event]: select returns 1
[lib/Redis/Fast.xs:197:wait_for_event]: ready to write
[lib/Redis/Fast.xs:201:wait_for_event]: finish
[lib/Redis/Fast.xs:177:wait_for_event]: select start, timeout is -1.000000

##### HERE IT WAITS AGAIN for brpop to return something, but we send it signal 15 AGAIN

[lib/Redis/Fast.xs:182:wait_for_event]: select returns -1
[lib/Redis/Fast.xs:189:wait_for_event]: exception!!
[lib/Redis/Fast.xs:287:_wait_all_responses]: error: 2
[lib/Redis/Fast.xs:352:Redis__Fast_reconnect]: start
[lib/Redis/Fast.xs:360:Redis__Fast_reconnect]: finish
[lib/Redis/Fast.xs:609:Redis__Fast_run_cmd]: Finish BRPOP
TERM handler called at -e line 1.
[lib/Redis/Fast.xs:862:XS_Redis__Fast_DESTROY]: start
[lib/Redis/Fast.xs:864:XS_Redis__Fast_DESTROY]: free ac
[lib/Redis/Fast.xs:423:Redis__Fast_sync_reply_cb]: 0x16901b0
[lib/Redis/Fast.xs:431:Redis__Fast_sync_reply_cb]: redis feeing
[lib/Redis/Fast.xs:439:Redis__Fast_sync_reply_cb]: finish
[lib/Redis/Fast.xs:423:Redis__Fast_sync_reply_cb]: 0x16901b0
[lib/Redis/Fast.xs:431:Redis__Fast_sync_reply_cb]: redis feeing
*** glibc detected *** perl: double free or corruption (!prev): 0x00000000016901b0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x381e876166]
/lib64/libc.so.6[0x381e878ca3]
/home/bj5004/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Redis/Fast/Fast.so(+0xa0c5)[0x7f75bff0d0c5]
/home/bj5004/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Redis/Fast/Fast.so(+0xed4d)[0x7f75bff11d4d]
/home/bj5004/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Redis/Fast/Fast.so(+0x7a85)[0x7f75bff0aa85]
/usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x38322a6815]
/usr/lib64/perl5/CORE/libperl.so(Perl_call_sv+0x696)[0x383224c7a6]
/usr/lib64/perl5/CORE/libperl.so(Perl_sv_clear+0xb6)[0x38322b8dd6]
/usr/lib64/perl5/CORE/libperl.so(Perl_sv_free2+0x52)[0x38322b95d2]
/usr/lib64/perl5/CORE/libperl.so(Perl_leave_scope+0xe45)[0x38322d61f5]
/usr/lib64/perl5/CORE/libperl.so(Perl_pp_leave+0x105)[0x38322a6125]
/usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x38322a4b06]
/usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x383224d0d8]
perl(main+0x154)[0x400e74]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x381e81ed1d]
perl[0x400c59]
======= Memory map: ========
00400000-00402000 r-xp 00000000 fd:00 282391                 /usr/bin/perl
00601000-00603000 rw-p 00001000 fd:00 282391                 /usr/bin/perl
01663000-01872000 rw-p 00000000 00:00 0                  [heap]
381e400000-381e420000 r-xp 00000000 fd:00 925185             /lib64/ld-2.12.so
381e61f000-381e620000 r--p 0001f000 fd:00 925185             /lib64/ld-2.12.so
381e620000-381e621000 rw-p 00020000 fd:00 925185             /lib64/ld-2.12.so
381e621000-381e622000 rw-p 00000000 00:00 0
381e800000-381e98b000 r-xp 00000000 fd:00 925188             /lib64/libc-2.12.so
381e98b000-381eb8a000 ---p 0018b000 fd:00 925188             /lib64/libc-2.12.so
381eb8a000-381eb8e000 r--p 0018a000 fd:00 925188             /lib64/libc-2.12.so
381eb8e000-381eb8f000 rw-p 0018e000 fd:00 925188             /lib64/libc-2.12.so
381eb8f000-381eb94000 rw-p 00000000 00:00 0
381ec00000-381ec17000 r-xp 00000000 fd:00 938218             /lib64/libpthread-2.12.so
381ec17000-381ee17000 ---p 00017000 fd:00 938218             /lib64/libpthread-2.12.so
381ee17000-381ee18000 r--p 00017000 fd:00 938218             /lib64/libpthread-2.12.so
381ee18000-381ee19000 rw-p 00018000 fd:00 938218             /lib64/libpthread-2.12.so
381ee19000-381ee1d000 rw-p 00000000 00:00 0
381f000000-381f002000 r-xp 00000000 fd:00 938224             /lib64/libdl-2.12.so
381f002000-381f202000 ---p 00002000 fd:00 938224             /lib64/libdl-2.12.so
381f202000-381f203000 r--p 00002000 fd:00 938224             /lib64/libdl-2.12.so
381f203000-381f204000 rw-p 00003000 fd:00 938224             /lib64/libdl-2.12.so
381f400000-381f483000 r-xp 00000000 fd:00 938221             /lib64/libm-2.12.so
381f483000-381f682000 ---p 00083000 fd:00 938221             /lib64/libm-2.12.so
381f682000-381f683000 r--p 00082000 fd:00 938221             /lib64/libm-2.12.so
381f683000-381f684000 rw-p 00083000 fd:00 938221             /lib64/libm-2.12.so
3820800000-3820816000 r-xp 00000000 fd:00 938228             /lib64/libresolv-2.12.so
3820816000-3820a16000 ---p 00016000 fd:00 938228             /lib64/libresolv-2.12.so
3820a16000-3820a17000 r--p 00016000 fd:00 938228             /lib64/libresolv-2.12.so
3820a17000-3820a18000 rw-p 00017000 fd:00 938228             /lib64/libresolv-2.12.so
3820a18000-3820a1a000 rw-p 00000000 00:00 0
3823c00000-3823c16000 r-xp 00000000 fd:00 938231             /lib64/libgcc_s-4.4.7-20120601.so.1
3823c16000-3823e15000 ---p 00016000 fd:00 938231             /lib64/libgcc_s-4.4.7-20120601.so.1
3823e15000-3823e16000 rw-p 00015000 fd:00 938231             /lib64/libgcc_s-4.4.7-20120601.so.1
3829400000-3829471000 r-xp 00000000 fd:00 938248             /lib64/libfreebl3.so
3829471000-3829670000 ---p 00071000 fd:00 938248             /lib64/libfreebl3.so
3829670000-3829672000 r--p 00070000 fd:00 938248             /lib64/libfreebl3.so
3829672000-3829673000 rw-p 00072000 fd:00 938248             /lib64/libfreebl3.so
3829673000-3829677000 rw-p 00000000 00:00 0
3829c00000-3829c07000 r-xp 00000000 fd:00 938249             /lib64/libcrypt-2.12.so
3829c07000-3829e07000 ---p 00007000 fd:00 938249             /lib64/libcrypt-2.12.so
3829e07000-3829e08000 r--p 00007000 fd:00 938249             /lib64/libcrypt-2.12.so
3829e08000-3829e09000 rw-p 00008000 fd:00 938249             /lib64/libcrypt-2.12.so
3829e09000-3829e37000 rw-p 00000000 00:00 0
382c800000-382c802000 r-xp 00000000 fd:00 938246             /lib64/libutil-2.12.so
382c802000-382ca01000 ---p 00002000 fd:00 938246             /lib64/libutil-2.12.so
382ca01000-382ca02000 r--p 00001000 fd:00 938246             /lib64/libutil-2.12.so
382ca02000-382ca03000 rw-p 00002000 fd:00 938246             /lib64/libutil-2.12.so
382e400000-382e416000 r-xp 00000000 fd:00 917625             /lib64/libnsl-2.12.so
382e416000-382e615000 ---p 00016000 fd:00 917625             /lib64/libnsl-2.12.so
382e615000-382e616000 r--p 00015000 fd:00 917625             /lib64/libnsl-2.12.so
382e616000-382e617000 rw-p 00016000 fd:00 917625             /lib64/libnsl-2.12.so
382e617000-382e619000 rw-p 00000000 00:00 0
3832200000-3832362000 r-xp 00000000 fd:00 278941             /usr/lib64/perl5/CORE/libperl.so
zsh: abort (core dumped)  perl -MRedis::Fast -E
shogo82148 commented 10 years ago

i think it is similar problem with #9. https://github.com/shogo82148/Redis-Fast/compare/fix-issue-9-avoid-double-free this branch will also fix this issue. try it please.

vsespb commented 10 years ago

double free related to #9 but ignoring signal looks related to #10 ?

bjakubski commented 10 years ago

Thanks for feedback!

It does appear related to other issues, but I decided to post it separatel, as I have nice, small and reproducible test case (and some details differ).

@shogo82148 this branch does indeed make "double free" go away, but main problem is that first TERM signal gets "eaten" somewhere. That in turn looks like #10 but I've never seen that without reconnect enabled. I suspect that select() being interrupted by signal is mistaken as a lost connection and reconnect code triggers in

shogo82148 commented 10 years ago

oops, sorry, I had misunderstood the problem.

I suspect that select() being interrupted by signal is mistaken as a lost connection and reconnect code triggers in

i think so, too. Redis::Fast always reconnect if select() returns error. but it should not reconnect with errno == EINTR. I will fix it.

shogo82148 commented 10 years ago

pr #12 will fix this. try it please.

shogo82148 commented 10 years ago

I merged #12 and released Redis::Fast 0.0.9.