apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0
3.53k stars, 464 forks

[QUESTION] Should we use backup or checkpoint of rocksdb for kvrocks backup #220

Closed: ShooterIT closed this issue 3 years ago

ShooterIT commented 3 years ago

Currently, we use RocksDB checkpoints to implement kvrocks full synchronization, but kvrocks backup still uses RocksDB backup. As we know, RocksDB backup costs a lot of bandwidth and disk space, but it supports incremental backup and can store backups in HDFS, which are good features. So I think we should decide whether to use RocksDB backup or RocksDB checkpoint for kvrocks backup. WDYT? @git-hulk @karelrooted @Alfejik
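
To make the disk-space tradeoff above concrete: a RocksDB checkpoint hard-links SST files when the checkpoint directory is on the same filesystem as the database, while a RocksDB backup copies file contents into the backup directory. The sketch below is not kvrocks or RocksDB code. It only imitates the two strategies with plain files, and every function and file name in it is hypothetical.

```python
import os
import shutil
import tempfile


def snapshot_with_links(db_dir, dest_dir):
    """Checkpoint-style snapshot: hard-link each file, so no data is duplicated."""
    os.makedirs(dest_dir)
    for name in sorted(os.listdir(db_dir)):
        os.link(os.path.join(db_dir, name), os.path.join(dest_dir, name))


def snapshot_with_copies(db_dir, dest_dir):
    """Backup-style snapshot: copy each file, so the data exists twice."""
    os.makedirs(dest_dir)
    for name in sorted(os.listdir(db_dir)):
        shutil.copy2(os.path.join(db_dir, name), os.path.join(dest_dir, name))


def demo():
    """Return (linked_shares_storage, copied_shares_storage) for one fake SST file."""
    root = tempfile.mkdtemp()
    try:
        db_dir = os.path.join(root, "db")
        os.makedirs(db_dir)
        sst = os.path.join(db_dir, "000001.sst")  # stand-in for a real SST file
        with open(sst, "wb") as f:
            f.write(b"x" * 4096)

        link_dir = os.path.join(root, "checkpoint")
        copy_dir = os.path.join(root, "backup")
        snapshot_with_links(db_dir, link_dir)
        snapshot_with_copies(db_dir, copy_dir)

        # samefile() is True only when both paths point at the same inode.
        linked = os.path.samefile(sst, os.path.join(link_dir, "000001.sst"))
        copied = os.path.samefile(sst, os.path.join(copy_dir, "000001.sst"))
        return linked, copied
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

The hard-linked snapshot shares storage with the live file, which is why it is cheap but only works on the same filesystem. The copied snapshot is independent of the source disk, which is what makes it usable on another disk or a remote store.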

git-hulk commented 3 years ago

It's a really good question. We expected users to use another disk to store the backup, but it seems most users won't do that. So IMO it would be fine to support only checkpoint for kvrocks backup.

ShooterIT commented 3 years ago

> we expected users to use another disk to store the backup

Yes, I think users don't buy this idea; it is not easy to deploy and maintain. Remote backup is a good idea, because our server is fundamentally a persistent key-value server and the backup should not be local. But HDFS is actually not that common: some companies don't use it and may just copy backups to other servers or remote storage.

Maybe it is a good idea to support both, although it is a bit of trouble. As a first step, we can support only RocksDB checkpoint as kvrocks backup.

git-hulk commented 3 years ago

Yeah, it would be better to support both if it doesn't introduce too much complexity.

ShooterIT commented 3 years ago

For incremental backup, maybe https://github.com/bitleak/kvrocks/wiki/How-to-backup can give you an idea :)
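
Independently of what the linked wiki page describes, here is one way incremental backup could be layered on top of checkpoints. It is a minimal sketch under stated assumptions, not the kvrocks implementation: it assumes the checkpoint directory is flat, that `.sst` files never change once written (so a copy already present in the backup can be reused), and that all other files are small enough to recopy every time. `incremental_copy` and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def incremental_copy(checkpoint_dir, backup_dir):
    """Sync backup_dir to checkpoint_dir and return the names actually copied.

    Assumption: *.sst files are immutable, so an existing copy is reused.
    Every other file (MANIFEST, CURRENT, ...) is always recopied.
    """
    os.makedirs(backup_dir, exist_ok=True)
    source_names = set(os.listdir(checkpoint_dir))
    copied = []
    for name in sorted(source_names):
        dst = os.path.join(backup_dir, name)
        if name.endswith(".sst") and os.path.exists(dst):
            continue  # already backed up by an earlier run
        shutil.copy2(os.path.join(checkpoint_dir, name), dst)
        copied.append(name)
    # Remove files the newest checkpoint no longer references.
    for name in set(os.listdir(backup_dir)) - source_names:
        os.remove(os.path.join(backup_dir, name))
    return copied


def write_file(directory, name):
    with open(os.path.join(directory, name), "wb") as f:
        f.write(b"data")


def demo():
    """Run two backups and return (first_copied, second_copied, backup_listing)."""
    root = tempfile.mkdtemp()
    try:
        ckpt = os.path.join(root, "checkpoint")
        backup = os.path.join(root, "backup")
        os.makedirs(ckpt)
        for name in ("000001.sst", "000002.sst", "MANIFEST-000004", "CURRENT"):
            write_file(ckpt, name)
        first = incremental_copy(ckpt, backup)

        # Simulate a later checkpoint: one SST compacted away, one new SST.
        os.remove(os.path.join(ckpt, "000002.sst"))
        write_file(ckpt, "000005.sst")
        second = incremental_copy(ckpt, backup)
        return first, second, sorted(os.listdir(backup))
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

In this sketch the second run transfers only the new SST file plus the small metadata files, which is where the bandwidth saving would come from. Note that it keeps just the latest state: older backup versions are not retained, unlike RocksDB backup.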