Closed yyyar closed 11 years ago
Hello Yaroslav,
Thanks for reporting this issue; it looks like webdis makes some assumptions about hiredis. I will have a look and try to reproduce it locally.
Nicolas
Hello Nicolas,
I am not an expert on the webdis code, so I might have overlooked something, but I was wondering if it can be a race condition.
Redis drops connections if they are idle, when this happens hiredis calls the disconnect callback (pool_on_disconnect) and webdis removes that connection from the pool, then hiredis releases the redisContext. I do not see any locks on the pool of connections, so it can happen then when a connection is been dropped and the context freed, webdis can assign the same connection to a new HTTP request, which it will end up using a connection with a freed redisContext.
@marcoPierleoni thanks, I will look into it. There are no locks since webdis has a connection pool per worker thread, but there could still be a race among several clients.
I also discovered yesterday that there is a known issue with SUBSCRIBE
in hiredis, but I'm not yet sure if it is related. I modified Redis to throw errors on SUBSCRIBE
but that did not kill webdis.
Still investigating...
I reproduced a crash.
I set redis with: maxclients 103
Then I started webdis with 100 connection. The first two calls to /SUBSCRIBE/channel are OK. With a third one webdis was killed.
If I remove -O3 in the compilation, webdis it is not killed anymore on the third SUBSCRIBE
Thanks a lot Marco, I'll try this scenario.
On 25 Jul 2012, at 23:49, Marco Pierleonireply@reply.github.com wrote:
I reproduced a crash.
I set redis with: maxclients 103
Then I started webdis with 100 connection. The first two calls to /SUBSCRIBE/channel are OK. With a third one webdis was killed.
If I remove -O3 in the compilation, webdis it is not killed anymore on the third SUBSCRIBE
Reply to this email directly or view it on GitHub: https://github.com/nicolasff/webdis/issues/53#issuecomment-7264688
Unfortunately it does look like the issue reported by @antirez in hiredis.
(gdb) bt #0 0x0000000000413938 in __redisGetSubscribeCallback (ac=0x68ddc0, reply=0x6, dstcb=0x7ffff73fad00) at hiredis/async.c:324 #1 0x0000000000413c84 in redisProcessCallbacks (ac=0x68ddc0) at hiredis/async.c:399 #2 0x0000000000413ea0 in redisAsyncHandleRead (ac=0x68ddc0) at hiredis/async.c:467 #3 0x0000000000409248 in redisLibeventReadEvent (fd=643, event=2, arg=0x68def0) at ./hiredis/adapters/libevent.h:15 #4 0x00007ffff7bc7f53 in event_base_loop () from /usr/lib/libevent-1.4.so.2 #5 0x0000000000403261 in worker_main (p=0x628f30) at worker.c:156 #6 0x00007ffff79acefc in start_thread (arg=0x7ffff73fb700) at pthread_create.c:304 #7 0x00007ffff76e759d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #8 0x0000000000000000 in ?? () (gdb) f 0 #0 0x0000000000413938 in __redisGetSubscribeCallback (ac=0x68ddc0, reply=0x6, dstcb=0x7ffff73fad00) at hiredis/async.c:324 324 if (reply->type == REDIS_REPLY_ARRAY) {
Hiredis fails to parse the reply object and leaves it broken (pointing to 0x6
in this case), after which dereferencing it crashes.
A temporary solution is to bump the number of connections allowed by Redis, since webdis creates a new connection for every call to SUBSCRIBE
, and does not reuse the pool connection for those clients.
My sincere apologies for this issue and the lack of an immediate fix, I will try to see if there is a workaround or if the fix in hiredis isn't too intrusive.
As described on the hiredis ticket, this should be fixed. I have updated webdis to use the latest version of hiredis and pushed it back to GitHub.
Thanks for reporting this problem!
Running webdis with ~100-150 connections. HBGET, MGET, GET, SUBSCRIBE, SMEMBERS, RPUSH commands are used.
Time-to-time, approximately once a day we have webdis crash. Here a gdb backtrace:
Program terminated with signal 11, Segmentation fault. (gdb) bt
0 0x00000000 in ?? ()
1 0xad445f70 in ?? ()
2 0x08058f44 in redisProcessCallbacks (ac=0xad445f70) at hiredis/async.c:403
3 0x08059109 in redisAsyncHandleRead (ac=0xad445f70) at hiredis/async.c:467
4 0x0804fbe0 in redisLibeventReadEvent (fd=378, event=2, arg=0xad45fb98) at ./hiredis/adapters/libevent.h:15
5 0xb773fce9 in event_base_loop () from /usr/lib/libevent-2.0.so.5
6 0xb7740a83 in event_base_dispatch () from /usr/lib/libevent-2.0.so.5
7 0x0804a656 in worker_main (p=0x97291f0) at worker.c:156
8 0xb771ed4c in start_thread () from /lib/i386-linux-gnu/tls/i686/nosegneg/libpthread.so.0
9 0xb765dace in clone () from /lib/i386-linux-gnu/tls/i686/nosegneg/libc.so.6
Starting webdis with ulimit -c 60000 "http_port": 80, "threads": 20, "pool_size": 5,
Redis 2.4.7 is on another machine.
Thank you in advance.