Lachim / redis

Automatically exported from code.google.com/p/redis

Seg Fault, apparently related to pub/sub #560

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What version of Redis are you using, and on what operating system?
2.2.7 on Ubuntu 10.04

What is the problem you are experiencing?
Seg Fault

What steps will reproduce the problem?
It seems to happen after running for 6-7 hours. This was not happening on 
2.0.2, but has been since upgrading to 2.2.7.

Do you have an INFO output? Please paste it here.
redis_git_sha1:a7fa2baf
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
process_id:17022
uptime_in_seconds:10971
uptime_in_days:0
lru_clock:602600
used_cpu_sys:495.61
used_cpu_user:911.12
used_cpu_sys_childrens:0.00
used_cpu_user_childrens:0.00
connected_clients:1291
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:27
used_memory:23827313960
used_memory_human:22.19G
used_memory_rss:30858354688
mem_fragmentation_ratio:1.30
use_tcmalloc:0
loading:0
aof_enabled:0
changes_since_last_save:1604522
bgsave_in_progress:0
last_save_time:1306249274
bgrewriteaof_in_progress:0
total_connections_received:7203
total_commands_processed:21741645
expired_keys:16895
evicted_keys:0
keyspace_hits:2919521
keyspace_misses:729233
hash_max_zipmap_entries:64
hash_max_zipmap_value:512
pubsub_channels:134489
pubsub_patterns:0
vm_enabled:0
role:master
allocation_stats:2=172,6=94,7=1,8=3235066,9=11041,10=2877671,11=14324558,12=417
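
A note for readers of the INFO dump above: the `mem_fragmentation_ratio` field is the ratio of `used_memory_rss` to `used_memory`. The following is only a minimal sketch, not part of the original report, that parses a few of the `key:value` lines copied from above and recomputes the ratio. The helper names `parse_info` and `fragmentation_ratio` are hypothetical and exist only for illustration.

```python
# A few fields copied verbatim from the INFO output in this report.
INFO_TEXT = """\
used_memory:23827313960
used_memory_rss:30858354688
mem_fragmentation_ratio:1.30
connected_clients:1291
blocked_clients:27
pubsub_channels:134489
"""


def parse_info(text):
    """Parse 'key:value' lines of INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def fragmentation_ratio(fields):
    """Recompute RSS / used_memory from the parsed fields."""
    return int(fields["used_memory_rss"]) / int(fields["used_memory"])


info = parse_info(INFO_TEXT)
ratio = fragmentation_ratio(info)
print(f"recomputed ratio: {ratio:.2f} (reported: {info['mem_fragmentation_ratio']})")
```

The recomputed value agrees with the reported `1.30` to two decimal places, so the memory figures in the dump are at least self-consistent.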

If it is a crash, can you please paste the stack trace that you can find in
the log file or on standard output? This is really useful for us!
[17022] 24 May 10:01:14 * Server started, Redis version 2.2.7
[17022] 24 May 10:02:43 * DB loaded from disk: 89 seconds
[17022] 24 May 10:02:43 * The server is now ready to accept connections on port 6379
[17022] 24 May 13:04:05 # ======= Ooops! Redis 2.2.7 got signal: -11- =======
[17022] 24 May 13:04:05 # redis_version:2.2.7
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(_addReplyObjectToList+0x32) [0x417a32]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(_addReplyObjectToList+0x32) [0x417a32]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(pubsubPublishMessage+0x56) [0x42c656]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(publishCommand+0x15) [0x42c7a5]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(call+0x23) [0x40f543]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(processCommand+0x267) [0x40f857]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(processInputBuffer+0x57) [0x418437]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(readQueryFromClient+0x5d) [0x4184ed]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(aeProcessEvents+0x153) [0x40c103]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(aeMain+0x2e) [0x40c34e]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(main+0xf7) [0x4113d7]
[17022] 24 May 13:04:05 # /lib/libc.so.6(__libc_start_main+0xfd) [0x7f15d4adec4d]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server() [0x40b679]
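
To make the call chain in the trace above easier to read, here is a rough sketch (again, not part of the original report) that pulls the symbol name and address out of each frame. The regular expression is an assumption based only on the frame layout seen in this log, `binary(symbol+offset) [address]`, and `parse_frames` is a hypothetical helper name.

```python
import re

# Assumed frame layout, taken from the log lines in this report:
#   /usr/local/bin/redis-server(publishCommand+0x15) [0x42c7a5]
FRAME_RE = re.compile(
    r"(?P<binary>\S+)"                   # path of the binary or library
    r"\((?P<symbol>[^)+]*)"              # symbol name (may be empty)
    r"(?:\+(?P<offset>0x[0-9a-f]+))?\)"  # optional +offset
    r"\s+\[(?P<addr>0x[0-9a-f]+)\]"      # return address
)

# The innermost frames, copied from the crash log above.
TRACE = """\
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(_addReplyObjectToList+0x32) [0x417a32]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(pubsubPublishMessage+0x56) [0x42c656]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(publishCommand+0x15) [0x42c7a5]
[17022] 24 May 13:04:05 # /usr/local/bin/redis-server(call+0x23) [0x40f543]
"""


def parse_frames(text):
    """Return (symbol, address) pairs, innermost frame first."""
    frames = []
    for match in FRAME_RE.finditer(text):
        # Frames without a symbol name are reported as "??".
        frames.append((match.group("symbol") or "??", match.group("addr")))
    return frames


frames = parse_frames(TRACE)
for symbol, addr in frames:
    print(f"{addr}  {symbol}")
```

Read from the bottom up, the parsed frames show a PUBLISH command being executed (`call`, `publishCommand`, `pubsubPublishMessage`) and the crash occurring while a reply is appended to a subscriber's output list (`_addReplyObjectToList`).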

Please provide any additional information below.

Original issue reported on code.google.com by wbm...@gmail.com on 24 May 2011 at 6:22

GoogleCodeExporter commented 8 years ago
Very bad! Thanks for reporting. This could be caused by many things. It looks 
similar to bug #503, but that one is already fixed in 2.2, so it is not the 
same problem. Either the client structure is broken when we publish to one of 
the clients waiting for data, or the pushed object is invalid in some way. 
I'll look at this tomorrow morning for sure.

If we have no luck working out where the problem is, we'll ask you for some 
help, given that you can reproduce the problem ;)

Cheers,
Salvatore

Original comment by anti...@gmail.com on 24 May 2011 at 6:32

GoogleCodeExporter commented 8 years ago
I'm happy to provide any assistance I can in tracking this one down. 
Unfortunately, it seems to happen only after many hours on our (quite active) 
main production instance.

Original comment by wbm...@gmail.com on 24 May 2011 at 6:44

GoogleCodeExporter commented 8 years ago
Thanks. One thing that can help is to compile Redis with 'make noopt' and run 
that build. The next stack dump will be more helpful than this one. Even 
better, after compiling with 'make noopt' you could run Redis this way:

gdb ./redis-server
run /path/to/redis.conf

And wait for the crash. When it crashes in gdb type:

bt

and this will output the back trace.

Having this info can be really valuable!

Salvatore

Original comment by anti...@gmail.com on 25 May 2011 at 9:27

GoogleCodeExporter commented 8 years ago
Another interesting thing is whether this happens with 2.2.6 as well. It is 
not impossible that this was introduced with 2.2.7.

Original comment by anti...@gmail.com on 25 May 2011 at 9:46

GoogleCodeExporter commented 8 years ago
Never mind, I'm able to reproduce the problem, so no further testing is needed 
on your side. I'll fix it ASAP and report back here.

Cheers,
Salvatore

Original comment by anti...@gmail.com on 25 May 2011 at 10:10

GoogleCodeExporter commented 8 years ago
Fixed in all GitHub main branches. Will release 2.2.9 ASAP. Thanks for the cooperation!

Original comment by anti...@gmail.com on 25 May 2011 at 10:58

GoogleCodeExporter commented 8 years ago
Great, thanks for addressing this so quickly. I'll roll out the 2.2 branch 
containing this fix.

Original comment by wbm...@gmail.com on 25 May 2011 at 6:51