yannh / redis-dump-go

Backup & Restore your Redis server - FAST
MIT License
274 stars 58 forks source link

Timeouts on large ZSETs #11

Closed linydquantil closed 3 years ago

linydquantil commented 4 years ago

redis version: 5.0.4 Sometimes it(redis-dump-go) hangs and waits for a long time.

yannh commented 4 years ago

hi @linydquantil , is it possible you re having connectivity issues / wifi? I m not aware of these kind of issues so far.

linydquantil commented 4 years ago

redis server is on localhost. for example:

Keyspace

db0:keys=77,expires=0,avg_ttl=0 db1:keys=81636,expires=0,avg_ttl=0

./redis-dump-go -db 1 > dump.resp.1

Database 1: 52800 element dumpedError: read tcp 127.0.0.1:60858->127.0.0.1:6379: i/o timeout Error: read tcp 127.0.0.1:60858->127.0.0.1:6379: i/o timeout Error: read tcp 127.0.0.1:60858->127.0.0.1:6379: i/o timeout Error: read tcp 127.0.0.1:60858->127.0.0.1:6379: i/o timeout Database 1: 72800 element dumped Database 1: 81636 element dumped

and then it hangs So is it still dumping data?

yannh commented 4 years ago

The tool opens 10 connections by default, it seems like some of these time out.. Can you try with -n 1 or -n 2 ? If there is any public dataset I could try to reproduce this on it would be quite useful.

linydquantil commented 4 years ago

I found that when the key type is zset, and the value of the key is too large, redis-dump-go will be hanged

linydquantil commented 4 years ago

When the key type is zset, and its value has 500w pieces of data, its output will be very slow

yannh commented 4 years ago

Hi @linydquantil , 500w? 500 items?

linydquantil commented 4 years ago

Yes, the func is image

yannh commented 4 years ago

Right - but is that Redis command itself slow? It seems i can increase timeouts so it doesnt timeout, but not sure why it would be so slow.

linydquantil commented 4 years ago

If only get data from redis by using zrange of redis command, it takes about 1 min.

linydquantil commented 4 years ago

But, when we use "radix.Cmd(&val,"ZRANGE",key...)", it will become very slow.

linydquantil commented 4 years ago

I think it may be a slow step to format the data obtained from redis into val. Whether this step can be optimized. Thanks

linydquantil commented 4 years ago

if you have any idea, please let me know

mpartio commented 3 years ago

I noticed the same problem with redis-dump-go seemingly hanging when dumping a large ZSET. I have a redis database with just one key, type ZSET with 1259853 entries, size ~170M.

Accessing for example the first 10000 entries with redis-cli is near instantaneous, but redis-dump-go just seems to hang and does not output anything. Using perf top I see that it's doing something but nothing comes out. I've waited for more than an hour until hitting ctrl+c. Dump is taken inside the container where redis is running.

[root@5fdd0bc6fb04 tmp]# ./redis-dump-go > out
Database 0: 1 element dumped

perf top:

  31.37%  redis-dump-go       [.] runtime.memmove
  23.29%  redis-dump-go       [.] runtime.greyobject
  20.93%  redis-dump-go       [.] runtime.scanobject
  16.19%  redis-dump-go       [.] runtime.findObject
   0.42%  redis-dump-go       [.] runtime.memclrNoHeapPointers
yannh commented 3 years ago

Hi @mpartio , interesting, I have not tested performance with single entries this large. I can't promise when I will have time to investigate, but with the information you provided I might be able to reproduce. Thanks!

mpartio commented 3 years ago

I tracked down the issue to string concatenation, namely this loop here:

https://github.com/yannh/redis-dump-go/blob/5723398d075a8ac46fd0c5deb0191ebb9730430a/redisdump/redisdump.go#L60

Maybe using a string builder would be better in case very large strings are created?

yannh commented 3 years ago

Yep - I also think this is the problem. On it.

yannh commented 3 years ago

Just pushed v0.4.5 . Now I'll go cry myself to sleep :rofl:

image

Please let me know if it helps.. It should be a nice little performance improvement for most use cases. Thanks for the pointer :+1:

linydquantil commented 3 years ago

Hi, @yannh @mpartio well done, I just test the latest redis-dump-go.

# Keyspace
db0:keys=293,expires=17,avg_ttl=0
db1:keys=7515,expires=0,avg_ttl=0                 my large key is in db1
db2:keys=3996,expires=0,avg_ttl=0
db3:keys=2198,expires=0,avg_ttl=0
db4:keys=500002,expires=0,avg_ttl=0
db5:keys=628,expires=251,avg_ttl=0
db12:keys=62,expires=0,avg_ttl=0

the command result is

[root@bos-test-redis tmp]# ./redis-dump-go -db 0 > dump.resp
Database 0: 293 element dumped
[root@bos-test-redis tmp]# ./redis-dump-go -db 1 > dump.resp 
Database 1: 3700 element dumpedError: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59440->127.0.0.1:6379: i/o timeout
Database 1: 3800 element dumpedError: read tcp 127.0.0.1:59434->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59434->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59438->127.0.0.1:6379: i/o timeout
Database 1: 5700 element dumpedError: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:59436->127.0.0.1:6379: i/o timeout
Database 1: 5800 element dumpedError: read tcp 127.0.0.1:32884->127.0.0.1:6379: i/o timeout
Error: read tcp 127.0.0.1:32884->127.0.0.1:6379: i/o timeout
Database 1: 7515 element dumped
[root@bos-test-redis tmp]# ./redis-dump-go -db 2 > dump.resp 
Database 2: 3996 element dumped
[root@bos-test-redis tmp]# ./redis-dump-go -db 3 > dump.resp 
Database 3: 2198 element dumped
[root@bos-test-redis tmp]# ./redis-dump-go -db 4 > dump.resp 
Database 4: 500158 element dumped

it will cause some errors

yannh commented 3 years ago

@linydquantil that seems to be a timeout in the lower level Redis library the tool is using (Radix). I've released v0.4.6 with a 5m timeout. All those are pretty difficult to tests without integration tests, which I haven't found the time to write just yet. Your continuous support is much appreciated.

linydquantil commented 3 years ago

@yannh i use v0.4.6 to do some tests. this tool is good for me

[root@bos-test-redis tmp]# ./redis-dump-go -db 1 > dump.resp
Database 1: 7515 element dumped

Next, I will do a more detailed test.

yannh commented 3 years ago

Well that took only a year ;) Sorry for the long delay, I didnt realise the values might have been this large. Thanks for sticking around! :muscle:

mpartio commented 3 years ago

Thanks @yannh, with the 0.4.5 the dump is near-instantaneous. I do have problems importing the result to a new redis instance though:

$ redis-cli --pipe < dump.resp
Error writing to the server: Connection reset by peer

This could be something in my environment though, as dump.resp looks like correct redis protocol. I have to dig deeper here to find out what's wrong. Thank you for your great support!

linydquantil commented 3 years ago

@mpartio you can test v0.4.6

yannh commented 3 years ago

@mpartio the redis-cli might have similar timeouts as redis-dump-go... I'll be closing this ticket for now, since this does look unrelated. Feel free to re-open or create a new ticket should the need arise. Thanks!