djveremix / redis

Automatically exported from code.google.com/p/redis

After a few days of operation Redis will "hang" while saving the DB to disk #292

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The server (x86_64) has plenty of memory (~7.5GB), of which Redis uses very 
little (~480MB).

The first time I had this issue was running 1.2.1 and it still occurs with 
1.2.6.

Whenever the "locking" occurs the behaviour is as follows:

21 Jul 03:14:18 - 10000 changes in 60 seconds. Saving...
21 Jul 03:14:19 - Background saving started by pid 25569
21 Jul 03:14:30 - DB saved on disk
21 Jul 03:14:31 - Background saving terminated with success
21 Jul 03:18:17 - 10000 changes in 60 seconds. Saving...
21 Jul 03:18:17 - Background saving started by pid 25723
21 Jul 03:18:29 - DB saved on disk
21 Jul 03:18:30 - Background saving terminated with success
21 Jul 03:23:31 - 10 changes in 300 seconds. Saving...
21 Jul 03:23:31 - Background saving started by pid 25919
21 Jul 03:23:43 - DB saved on disk
21 Jul 03:28:57 - DB saved on disk
21 Jul 03:34:10 - DB saved on disk
21 Jul 03:39:24 - DB saved on disk

----

And it goes on until the server is restarted. Sometimes the shutdown actually 
succeeds; sometimes the server needs to be killed. If you try to use redis-cli 
or any other Redis client while this is happening, it'll just hang when you 
issue a command.

By looking at the munin charts from around that time I can see that the memory 
usage was consistent with what you'll see below and that this "DB saved on 
disk" loop causes a huge CPU spike.
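
The pattern in the log excerpt above (a background save that logs "DB saved on disk" repeatedly and never logs "Background saving terminated") can be checked for mechanically. The following is a minimal sketch, assuming the log wording shown in the excerpt; the function name `find_stuck_saves` is hypothetical and other Redis versions may word these messages differently.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Patterns assumed from the log excerpt above; not taken from Redis docs.
STARTED = re.compile(r"Background saving started by pid (\d+)")
SAVED = "DB saved on disk"
TERMINATED = "Background saving terminated"


def find_stuck_saves(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (pid, count) for each background save that logged
    'DB saved on disk' more than once before terminating, or that
    never terminated at all."""
    stuck: List[Tuple[int, int]] = []
    pid: Optional[int] = None
    saved = 0

    def close_current() -> None:
        # Record the save being tracked if it looks abnormal.
        if pid is not None and saved > 1:
            stuck.append((pid, saved))

    for line in lines:
        match = STARTED.search(line)
        if match:
            close_current()
            pid, saved = int(match.group(1)), 0
        elif TERMINATED in line:
            close_current()
            pid, saved = None, 0
        elif SAVED in line and pid is not None:
            saved += 1
    close_current()
    return stuck


sample = """\
21 Jul 03:18:17 - Background saving started by pid 25723
21 Jul 03:18:29 - DB saved on disk
21 Jul 03:18:30 - Background saving terminated with success
21 Jul 03:23:31 - Background saving started by pid 25919
21 Jul 03:23:43 - DB saved on disk
21 Jul 03:28:57 - DB saved on disk
21 Jul 03:34:10 - DB saved on disk
21 Jul 03:39:24 - DB saved on disk
""".splitlines()

print(find_stuck_saves(sample))  # [(25919, 4)]
```

This only flags the symptom in the log; it says nothing about the cause.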

INFO
redis_version:1.2.6
arch_bits:64
multiplexing_api:epoll
uptime_in_seconds:111520
uptime_in_days:1
connected_clients:280
connected_slaves:0
used_memory:504589264
used_memory_human:481.21M
changes_since_last_save:4421
bgsave_in_progress:0
last_save_time:1279802019
bgrewriteaof_in_progress:0
total_connections_received:476
total_commands_processed:12891953
role:master
db0:keys=20128,expires=2991

----
             total       used       free     shared    buffers     cached
Mem:          7687       7178        508          0        367       4890
-/+ buffers/cache:       1920       5766
Swap:         6047          7       6040

----

This server is used as storage for some Resque queues (which is pretty much 
RPUSH/LPOP) and some other things, mostly SET, GET, INCR and DEL.

Original issue reported on code.google.com by tftfmac...@gmail.com on 22 Jul 2010 at 12:40

GoogleCodeExporter commented 9 years ago
Have you considered upgrading?  I've been using a Redis 1.3.10 branch for a 
couple months on our production boxes with multi-month uptimes, and I've not 
suffered the same issue.

You may want to try the release candidate of Redis 2.0; it has gotten a lot 
more testing than the version I'm running.

Original comment by josiah.c...@gmail.com on 22 Jul 2010 at 5:54

GoogleCodeExporter commented 9 years ago
Hello, never experienced this problem, nor received a bug report of this issue 
before.

This sounds like something very specific with your environment... btw 1.2.x is 
going to be deprecated in favor of 2.0.x, so please can you check if you have 
this bug with 2.0.x as well? Thank you!

Cheers,
Salvatore

Original comment by anti...@gmail.com on 30 Aug 2010 at 1:45

GoogleCodeExporter commented 9 years ago
We noticed this "Hang while DB Save" issue, too.

In our case I was able to debug Redis. It turned out that we had multiple huge 
batches of commands (handled via pipelines) which were sent to Redis at very 
short intervals. After the forked save process returned, the main process 
pretty much choked and became *very* slow while trying to process all the 
commands. I resolved this issue by making sure that the batches wouldn't become 
too big. Instead of one huge batch I'm sending multiple smaller ones, and now 
Redis is back in action.
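
The workaround described above, splitting one huge pipelined batch into several smaller ones, might look roughly like the sketch below. It is not the code from this setup: `chunked`, `send_in_batches` and the `send_batch` callback are hypothetical names, and the batch size of 500 is an arbitrary assumption. With a real client, `send_batch` would typically queue one batch of commands on a pipeline and execute it.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

Command = Tuple[str, ...]  # e.g. ("RPUSH", "queue", "job-1")


def chunked(commands: Iterable[Command], size: int) -> Iterator[List[Command]]:
    """Yield successive lists of at most `size` commands."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(commands)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def send_in_batches(
    commands: Iterable[Command],
    send_batch: Callable[[List[Command]], None],
    size: int = 500,  # assumed value; tune for the workload
) -> int:
    """Send commands in bounded batches and return how many batches were sent."""
    sent = 0
    for batch in chunked(commands, size):
        send_batch(batch)  # hypothetical: one pipeline round trip per batch
        sent += 1
    return sent


# Usage with a stand-in sender that only records the batch sizes.
sizes: List[int] = []
commands = [("INCR", f"counter:{i}") for i in range(1050)]
batches_sent = send_in_batches(commands, lambda batch: sizes.append(len(batch)))
print(batches_sent, sizes)  # 3 [500, 500, 50]
```

Bounding the batch size keeps the server from receiving one very large burst of queued commands at once, which is the situation described above.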

I thought I'd let you know.

Original comment by michael....@gmail.com on 16 Sep 2010 at 12:17

GoogleCodeExporter commented 9 years ago
Hello Michael,

please can you tell me if you are using virtual memory in your setup?

Thanks!
Salvatore

Original comment by anti...@gmail.com on 16 Sep 2010 at 12:26

GoogleCodeExporter commented 9 years ago
Hi Salvatore,

We did test with virtual memory and without. With virtual memory the effect 
took a bit longer to replicate but it happened, too. In the end the effect was 
the same.

Original comment by michael....@gmail.com on 16 Sep 2010 at 12:48

GoogleCodeExporter commented 9 years ago
For my setup, upgrading to 2.0.x solved it.

Original comment by tftfmac...@gmail.com on 2 Nov 2010 at 12:24