scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/

Checksum comparison #3885

Open karol-kokoszka opened 2 weeks ago

karol-kokoszka commented 2 weeks ago

Fixes https://github.com/scylladb/scylla-manager/issues/3827

This PR adds an explicit deduplication stage to the backup process. This stage compares each SSTable's .crc32 file with its counterpart in remote storage. If the contents are identical, all files belonging to that SSTable ID are removed from the snapshot dir and are not transferred to remote storage.

This mitigates the problem with the current deduplication/checksum comparison, which makes SM compute a checksum of every SSTable file and compare it with the checksum from remote storage.

An SSTable's .crc32 file holds the checksum of the SSTable data file and is created by the Scylla server.

This PR also changes the default configuration in scylla-manager-agent.yaml to exclude the checksum comparison from RClone.


karol-kokoszka commented 2 weeks ago

Integration tests are failing. To be checked.