redis / riot

🧨 Get data in & out of Redis with RIOT
http://redis.github.io/riot
Apache License 2.0
276 stars 39 forks

[riot-redis] Live replication not working on production instances. Huge memory usage. Data not replicated. #47

Closed mrnovalles closed 3 years ago

mrnovalles commented 3 years ago

While testing in our ci and staging environments live replication was working as expected but once we moved the replication to production we observed several issues.

Scenario description

Issue description

Command:

./riot-redis --info -h production-src-redis -p 6379 --db=0 replicate -h production-dst-encrypted-redis -p 6379 -t --db=0 --pass=apasswordhere --no-verify-peer --live

While running that command the output is:

Opening writer
Opening reader
Replicating from redis://production-src-redis█
mrnovalles commented 3 years ago

After running the task for 60 minutes, the error logs show:

Could not get value: Invalid first byte: 55 (7) at buffer index 1 decoding using RESP2
Replicating from redis://production-src-redis█Could not get value: Java heap space
Could not get value: Invalid first byte: 55 (7) at buffer index 1 decoding using RESP2
Reconnecting, last destination was production-src-rediss.com/10.11.3.0:6379
Reconnected to production-src-rediss.com:6379
Replicating from redis://production-src-redis█Could not get value: Java heap space
Could not get value: Invalid first byte: 55 (7) at buffer index 1 decoding using RESP2
Reconnecting, last destination was production-src-rediss.com/10.11.3.0:6379
Reconnected to production-src-rediss.com:6379
jruaux commented 3 years ago

What kind of load are you putting on the src (ops/sec)? You can check with redis-cli info stats. I'm thinking RIOT can't keep up with the rate of change.

Also, do you have any big keys on the src? https://redis.io/topics/rediscli#scanning-for-big-keys
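For anyone gathering the numbers being asked for here, something like the following should work (the hostname is the placeholder from the command above):

```shell
# Placeholder hostname; substitute your source instance.
SRC=production-src-redis

# instantaneous_ops_per_sec reports the current command throughput.
redis-cli -h "$SRC" -p 6379 info stats | grep instantaneous_ops_per_sec

# Sample the keyspace for the largest key of each type.
# --bigkeys is SCAN-based, so it is safe to run against production.
redis-cli -h "$SRC" -p 6379 --bigkeys
```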

phuesler commented 3 years ago

There is some traffic on there. About 200 data changes per second. I don't have the stats handy right now.

How does RIOT deal with the pub/sub updates while it is transferring the initial data? Does it store them to apply later?

jruaux commented 3 years ago

Yes, keyspace notifications are queued. I just changed part of the notification queuing mechanism: updates to the same key are now deduplicated, so memory usage should be lower if you have repeatedly updated keys. I also added a separate status bar for the live part of the replication process. Can you try the latest release and let me know if things improve?
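Since live replication depends on keyspace notifications being delivered from the source, it is worth verifying they are enabled there (hostname is a placeholder; note that Amazon ElastiCache blocks the CONFIG command, so there this setting lives in the parameter group instead):

```shell
# Placeholder hostname for the source instance.
SRC=production-src-redis

# Check whether keyspace notifications are enabled on the source.
redis-cli -h "$SRC" -p 6379 config get notify-keyspace-events

# On a self-managed instance you could enable all key-event notifications:
# redis-cli -h "$SRC" -p 6379 config set notify-keyspace-events KEA
```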

phuesler commented 3 years ago

Thank you, @jruaux, we'll have a look.

mrnovalles commented 3 years ago

We've tried the latest version. The progress bar for live replication does help us see what the tool is doing. However, we still experience sustained peaks of memory usage of around 5.5GB (smaller than before, but still high for live-replicating a Redis instance of 2.6GB).

Additionally, the Redis instance in question, which runs on Amazon ElastiCache, has:

215 set commands/sec
320 get commands/sec

With this load, the replication was lagging behind and at times was not being pushed through.

We played around with the parameters: --flush-interval=5 instead of 50, and --notification-queue=10000 instead of the default of 1000. With these changes, we found that a key written on the src instance takes ~180 seconds to be replicated to the dst instance.
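For anyone reproducing this, the invocation with those overrides would look roughly like the original command (same placeholder hosts and password):

```shell
# Placeholder hosts/password from the original command; --flush-interval
# and --notification-queue values as described above.
./riot-redis --info -h production-src-redis -p 6379 --db=0 \
  replicate -h production-dst-encrypted-redis -p 6379 -t --db=0 \
  --pass=apasswordhere --no-verify-peer --live \
  --flush-interval=5 --notification-queue=10000
```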

Any suggestions on how we can further reduce that replication lag? Thanks for the continuous support on this.

jruaux commented 3 years ago

Hi,

I found a potential culprit for the memory usage: keyspace notification callbacks were not discarded when the queue was full. This is fixed in the latest release, v2.3.2, which also adds options to set the notification queue and reader queue capacities.

Regarding replication lag, I'm not exactly sure what the root cause is. I would not lower the flush interval but instead increase it (1000, for example, to flush every second), since flushing forces items not yet in a complete batch to be replicated over. Can you also try with a bigger batch size (100, 500) and more threads (8, for example)? The idea is to maximize network utilization.

If that does not help, we will need more metrics on the data you are replicating, like data structure types (string, hash, list, ...) and value sizes. If you can run redis-cli --bigkeys, that will give us an idea.
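Assuming the RIOT 2.x option names (--batch for batch size; --threads is already used in the commands above), a tuned invocation along these lines is what is being suggested:

```shell
# Placeholder hosts/password as in the original command.
# --batch is the assumed name of the batch-size option; check --help
# on your RIOT version to confirm.
./riot-redis --info -h production-src-redis -p 6379 --db=0 \
  replicate -h production-dst-encrypted-redis -p 6379 -t --db=0 \
  --pass=apasswordhere --no-verify-peer --live \
  --flush-interval=1000 --threads=8 --batch=500
```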

mrnovalles commented 3 years ago

I've tried with these params:

./riot-redis --info -h src_host -p 6379 --db=0 replicate -h dst_host -p 6379 -t --db=0 --pass=${REDIS_AUTH} --no-verify-peer --live --flush-interval=1000 --threads=8

That made the memory consumption peak at 8GB, which at this point stopped being the main issue. The replication lag is still present, and it's longer than it previously was (around 5 minutes now).

Also, regarding "try with bigger batch size (100, 500)": what would be the parameter to change there? I'll add some info on the key types in the instance in a following comment.

mrnovalles commented 3 years ago

Running redis-cli --bigkeys gave me (redacted key names):

-------- summary -------
Sampled 1885849 keys in the keyspace!
Total key length in bytes is 28819421 (avg len 15.28)
Biggest   list found 'foo' has 309 items
Biggest   hash found 'bar' has 9 fields
Biggest string found 'bing' has 24 bytes
Biggest    set found 'faz' has 74822 members
Biggest   zset found 'foz' has 7374431 members

7161 lists with 40214 items (00.38% of keys, avg size 5.62)
1339182 hashs with 1953140 fields (71.01% of keys, avg size 1.46)
539425 strings with 7198840 bytes (28.60% of keys, avg size 13.35)
0 streams with 0 entries (00.00% of keys, avg size 0.00)
67 sets with 695212 members (00.00% of keys, avg size 10376.30)
14 zsets with 15486335 members (00.00% of keys, avg size 1106166.79)

Meanwhile, analysis using RedisInsight showed the following (screenshot omitted):

Apparently there are some 13 ZSET keys taking up 1.70GB of space.
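To confirm which ZSETs dominate memory, per-key sizes can also be checked directly from redis-cli (key names below are the redacted placeholders from the --bigkeys output; MEMORY USAGE is available from Redis 4 onward):

```shell
# 'foz' is the redacted name of the biggest zset from --bigkeys above.
SRC=production-src-redis

# Approximate memory footprint of the key, in bytes.
redis-cli -h "$SRC" -p 6379 memory usage foz

# Cardinality of the sorted set.
redis-cli -h "$SRC" -p 6379 zcard foz

# Newer redis-cli versions can also sample per-key memory across the keyspace:
# redis-cli -h "$SRC" -p 6379 --memkeys
```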

jruaux commented 3 years ago

OK, so what I suspect is happening is that RIOT can't keep up with the rate of change. Are the ZSETs updated constantly? If so, that would mean that for each ZSET modification RIOT has to DUMP and RESTORE the whole key. Do you have a key naming convention? If so, could you try using --scan-match [^myprefix]* in order to avoid the zset prefix myprefix?
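The suggested pattern can be dry-run before replicating, since redis-cli accepts the same glob syntax ('myprefix' is hypothetical here; one caveat worth knowing is that Redis globs negate single characters, not whole prefixes):

```shell
# List the keys that WOULD be included by the match pattern.
# '[^m]*' matches every key whose first character is not 'm' -- so it
# skips all keys starting with 'm', not only 'myprefix...' keys.
redis-cli -h production-src-redis -p 6379 --scan --pattern '[^m]*' | head
```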

phuesler commented 3 years ago

That is worth a try. Yes, I think our larger ZSETs get updated regularly. I just went over the Redis documentation again. If I understand correctly, a keyspace notification does not contain the modified set member, only the set's key.

If that is indeed so, I don't think using keyspace notifications is feasible for our specific use case.
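That behavior is easy to confirm against a throwaway local Redis (two terminals; key and member names here are made up for the demo):

```shell
# Terminal 1: enable key-event notifications, then subscribe to zadd
# events on db 0.
redis-cli config set notify-keyspace-events KEA
redis-cli psubscribe '__keyevent@0__:zadd'

# Terminal 2: modify a sorted set.
redis-cli zadd myzset 1 some-member

# Terminal 1 receives the message 'myzset' -- the key name only. The
# member 'some-member' is not part of the notification, which is why a
# replicator has to re-read (DUMP) the whole key on every update.
```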

mrnovalles commented 3 years ago

Hey @jruaux, thanks for your answer and the time you've spent with us on this issue. So yes, you are right: adding a --scan-match='[^myprefix]*' option when running --live replication makes it work and removes the replication lag. Given our usage of Redis and the key sizes we have, it appears we won't be able to fully replicate the content of A to B.

jruaux commented 3 years ago

Thanks for confirming that this was the issue.