Open GoogleCodeExporter opened 8 years ago
Thank you for this report. Can you be more specific about the commands you
execute on the Java side? In particular the key names and sequence of commands
(when are the BRPOPLPUSH operations unblocked, when are the LPUSH operations
called) will be helpful in tracking down how this assert can fail. I've looked
into this for a bit, but was unable to find a sequence to reproduce this issue.
Original comment by pcnoordh...@gmail.com
on 6 Jul 2011 at 10:54
If you could also send us a script to reproduce that we will be able to fix it
much faster, thanks!
Original comment by anti...@gmail.com
on 6 Jul 2011 at 11:08
I've encountered similar bug, tracked it down to some point and wrote script to
reproduce it. The script is attached to that comment and also available at
https://gist.github.com/1188698
Think I just wrote it down here:
--------------------
#!/bin/sh
src/redis-cli -r 10000 LPUSH a data &
src/redis-cli -r 10000 BRPOPLPUSH a b 0 &
src/redis-cli -r 10000 BRPOPLPUSH b c 0 &
wait
--------------------
So, to reproduce that bug you need to repeat in 3 concurrent loops the
following commands:
LPUSH a data
BRPOPLPUSH a b 0
BRPOPLPUSH b c 0
At some point in time, usually about 1 seconds since script start Redis stops
with ASSERTION FAILED message. Interesting thing is that sometimes the bug just
does not arise and script works well. In that case I restart Redis server for
some time and eventually the bug will appears again.
I am very worried about this error, and hope you can find the time to fix it.
Linux foo 2.6.39-0-generic #5~20110427-Ubuntu SMP Wed Apr 27 17:41:08 UTC 2011
i686 i686 i386 GNU/Linux
Intel i3
Redis output
------------
foo@foo:/opt/redis-2.2.10$ ./src/redis-server
[1029] 02 Sep 23:28:06 # Warning: no config file specified, using the default
config. In order to specify a config file use 'redis-server /path/to/redis.conf'
[1029] 02 Sep 23:28:06 * Server started, Redis version 2.2.10
[1029] 02 Sep 23:28:06 # WARNING overcommit_memory is set to 0! Background save
may fail under low memory condition. To fix this issue add
'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the
command 'sysctl vm.overcommit_memory=1' for this to take effect.
[1029] 02 Sep 23:28:06 * The server is now ready to accept connections on port
6379
[1029] 02 Sep 23:28:06 - 0 clients connected (0 slaves), 539480 bytes in use
[1029] 02 Sep 23:28:08 - Accepted 127.0.0.1:34708
[1029] 02 Sep 23:28:08 - Accepted 127.0.0.1:34709
[1029] 02 Sep 23:28:08 - Accepted 127.0.0.1:34710
[1029] 02 Sep 23:28:08 - Accepted 127.0.0.1:34711
[1029] 02 Sep 23:28:08 - Accepted 127.0.0.1:34712
[1029] 02 Sep 23:28:08 # === ASSERTION FAILED ===
[1029] 02 Sep 23:28:08 # ==> networking.c:706 'c->argc == 0' is not true
[1029] 02 Sep 23:28:08 # (forcing SIGSEGV in order to print the stack trace)
[1029] 02 Sep 23:28:08 # ======= Ooops! Redis 2.2.10 got signal: -11- =======
[1029] 02 Sep 23:28:08 # redis_version:2.2.10
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:32
multiplexing_api:epoll
process_id:1029
uptime_in_seconds:2
uptime_in_days:0
lru_clock:1473944
used_cpu_sys:0.00
used_cpu_user:0.00
used_cpu_sys_childrens:0.00
used_cpu_user_childrens:0.00
connected_clients:5
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:42
blocked_clients:0
used_memory:579400
used_memory_human:565.82K
used_memory_rss:1220608
mem_fragmentation_ratio:2.11
use_tcmalloc:0
loading:0
aof_enabled:0
changes_since_last_save:16
bgsave_in_progress:0
last_save_time:1314973686
bgrewriteaof_in_progress:0
total_connections_received:5
total_commands_processed:15
expired_keys:0
evicted_keys:0
keyspace_hits:15
keyspace_misses:14
hash_max_zipmap_entries:512
hash_max_zipmap_value:64
pubsub_channels:0
pubsub_patterns:0
vm_enabled:0
role:master
allocation_stats:4=4,6=1,8=5,9=9,10=42,11=11,12=10272,13=45,14=33,15=30,16=37,17
=16,18=10,19=23,20=6,21=2,22=2,23=2,24=32,25=2,26
[1029] 02 Sep 23:28:08 # ./src/redis-server(_redisAssert+0x57) [0x8074067]
[1029] 02 Sep 23:28:08 # ./src/redis-server(processMultibulkBuffer+0x270)
[0x805d1c0]
[1029] 02 Sep 23:28:08 # ./src/redis-server(processInputBuffer+0x9a) [0x805d2ba]
[1029] 02 Sep 23:28:08 # ./src/redis-server(readQueryFromClient+0x70)
[0x805d370]
[1029] 02 Sep 23:28:08 # ./src/redis-server(aeProcessEvents+0x12e) [0x8050f0e]
[1029] 02 Sep 23:28:08 # ./src/redis-server(aeMain+0x37) [0x80512a7]
[1029] 02 Sep 23:28:08 # ./src/redis-server(main+0x111) [0x8056491]
[1029] 02 Sep 23:28:08 # /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xe7)
[0xcd5e37]
[1029] 02 Sep 23:28:08 # ./src/redis-server() [0x80502f1]
Segmentation fault
Original comment by senotru...@gmail.com
on 2 Sep 2011 at 2:37
Attachments:
Hello, I can confirm that:
1) The script is actually able to reproduce the bug.
2) That the bug is still present in Redis 2.2.12.
I'm leaving for honeymoon but this is a critical stuff. Pieter: are you working
this week? Otherwise I can care about it as this looks not too hard to fix even
from an hotel ;)
Thank you very much for discovering this... and especially for providing a
trigger script.
Original comment by anti...@gmail.com
on 3 Sep 2011 at 9:42
I forgot to add that the bug is present into unstable as well.
Original comment by anti...@gmail.com
on 3 Sep 2011 at 9:43
The following script also triggers the bug:
-----------------------------
#!/bin/sh
./redis-cli -r 2 BRPOPLPUSH a b 0 &
./redis-cli -r 1 BRPOPLPUSH b c 0 &
sleep 1
./redis-cli -r 1 LPUSH a data &
wait
-----------------------------
It shows even more clearly what is going on. The problem is a small bug that is
triggered when you chain BRPOPLPUSH with other blocking commands like above.
The following change appears to be a small and clean fix for this:
https://github.com/hampus/redis/commit/c47d152c8d96415de1af994b1a4bb3e0347caef3.
It's probably just a typo that hasn't been discovered before.
I may have missed something, because I haven't been able to put much time into
this today. Neither of the scripts trigger any assertions anymore and all the
tests work, at least. Pieter, perhaps you could have a look at the patch?
Cheers,
Hampus Wessman
Original comment by hampus.w...@gmail.com
on 7 Sep 2011 at 5:30
Thank you Hampus, finally monday I'll be back at work and will check/merge this
fix as well.
Original comment by anti...@gmail.com
on 10 Sep 2011 at 11:44
Have a nice time until then, Salvatore!
Original comment by hampus.w...@gmail.com
on 10 Sep 2011 at 12:13
Thanks Hampus! Back finally :) Definitely a typo, helped by the fact that "c"
and "origclient" are lame names...
Btw now that is merged, writing a test right now!
Salvatore
Original comment by anti...@gmail.com
on 12 Sep 2011 at 8:33
regression added to 2.4 and unstable, closing the issue, thanks everybody.
We'll release a new 2.4 RC and 2.2.13 containing this fix today.
Salvatore
Original comment by anti...@gmail.com
on 12 Sep 2011 at 8:57
Thank you, Salvatore!
I'm glad I took part.
Stan.
Original comment by senotru...@gmail.com
on 12 Sep 2011 at 9:01
Original issue reported on code.google.com by
kevin.clark@gmail.com
on 22 Jun 2011 at 10:16