apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0

Add option to disable incremental sync in kvrocks2redis #2165

Open QQxiaoyuyu opened 6 months ago

QQxiaoyuyu commented 6 months ago

Version

2.8

Minimal reproduce step

Looking downwards

What did you expect to see?

Looking downwards

What did you see instead?

Looking downwards

Anything Else?

Currently, kvrocks2redis synchronizes incremental data by default. If the increment is too large, many data files are written. We could consider adding a configuration option that disables writing incremental data, so that only a full synchronization is performed.
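As a rough illustration of what such a switch could look like, here is a minimal sketch. The option name `incremental-sync`, the `SyncOptions` struct, and `ApplyConfigLine` are all hypothetical and do not exist in kvrocks2redis; the only assumptions are a `key value` line format with `yes`/`no` values and a default that keeps today's behavior.

```cpp
#include <sstream>
#include <string>

// Hypothetical options struct, not part of kvrocks2redis.
struct SyncOptions {
  // Defaults to true so that existing setups keep incremental sync.
  bool incremental_sync = true;
};

// Applies one "key value" config line to opts.
// Returns false if the line is not the (hypothetical) "incremental-sync"
// option or if its value is neither "yes" nor "no"; opts is left untouched
// in that case.
bool ApplyConfigLine(const std::string& line, SyncOptions* opts) {
  std::istringstream iss(line);
  std::string key, value;
  if (!(iss >> key >> value)) return false;
  if (key != "incremental-sync") return false;
  if (value == "yes") {
    opts->incremental_sync = true;
    return true;
  }
  if (value == "no") {
    opts->incremental_sync = false;
    return true;
  }
  return false;
}
```

With something like this, the sync loop could stop after the full synchronization whenever `incremental_sync` is false, instead of continuing to write incremental data files.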

zjregee commented 5 months ago

I'd like to try this.

git-hulk commented 5 months ago

@zjregee Thank you!

zjregee commented 4 months ago

Hello, @git-hulk. I've run into some problems while working on this issue and would appreciate your advice.

kvrocks2redis currently writes data into a new Redis step by step by writing an AOF file for all the data in Kvrocks, which is not reasonable when Kvrocks contains a large amount of data. To solve this problem, we could first create an RDB file, synchronize the bulk of the data at once through the RDB file, and then continue to synchronize any unsynchronized data through the original incremental synchronization method.

However, I am not sure how to pass the RDB file to the new Redis. Redis does not seem to support receiving an RDB file directly through a command. And if the new Redis is synchronized through slaveof, the resulting master-slave relationship does not seem to fit this requirement.

How should I solve this problem? In order for the Redis to receive the rdb file, I seem to need to send the rdb file to the data directory related to the new Redis and restart the Redis. I am not sure if this method is reasonable.

git-hulk commented 4 months ago

@zjregee Sorry for not getting back to you sooner.

> How should I solve this problem? In order for the Redis to receive the rdb file, I seem to need to send the rdb file to the data directory related to the new Redis and restart the Redis. I am not sure if this method is reasonable.

For now, Redis only supports loading an RDB through replication, and it would not be good to require users to restart the Redis server just to sync the RDB.

> kvrocks2redis currently writes data into a new Redis step by step by writing an AOF file for all the data in Kvrocks, which is not reasonable when Kvrocks contains a large amount of data.

For this scenario, perhaps we can send the key-value pairs to the target node directly instead of writing an AOF file.
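To make the "send directly" idea a bit more concrete: a client talks to Redis using RESP, where each command is an array of bulk strings. The sketch below only shows the encoding step; `EncodeRespCommand` is a hypothetical helper name, not an existing kvrocks2redis function, and writing the resulting bytes to the target node's socket (plus reading the reply) is left out.

```cpp
#include <string>
#include <vector>

// Encodes one command as a RESP array of bulk strings, e.g.
// {"SET", "k", "v"} -> "*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n".
// Each argument is length-prefixed, so keys and values may contain
// arbitrary bytes, including "\r\n".
std::string EncodeRespCommand(const std::vector<std::string>& args) {
  std::string out = "*" + std::to_string(args.size()) + "\r\n";
  for (const auto& arg : args) {
    out += "$" + std::to_string(arg.size()) + "\r\n";
    out += arg;
    out += "\r\n";
  }
  return out;
}
```

For example, `EncodeRespCommand({"SET", "k", "v"})` yields the bytes that could be written straight to the target connection for each key-value pair read from Kvrocks, rather than appending them to a file first.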