andrewchambers / bupstash

Easy and efficient encrypted backups.
https://bupstash.io
MIT License

Add configurable robustness for block recovery #51

Open LarsKumbier opened 3 years ago

LarsKumbier commented 3 years ago

The current bupstash setup does not handle data corruption in the repository (e.g. through bitrot, or storing the repository on media with a low MTBF). This could be addressed by adding an optional erasure code such as Reed-Solomon or a similar algorithm.

Since this is a trade-off between robustness and an increase in repository size due to the parity data, the user should be able to choose the acceptable amount of data corruption per block size. And since a user might use different repositories on different storage media, each with different levels of existing redundancy, the robustness value should be configurable per repository.

Bupstash should also include a subcommand to check and repair a remote repository. Since RAID and tape storage systems use the term "scrub" for this, I suggest using it as the subcommand name here as well.
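For illustration, here is a minimal sketch of the kind of block-level parity scheme being proposed, using the reed-solomon-erasure crate as one possible Rust implementation. The 4+2 shard layout and the toy block contents are illustrative assumptions, not anything bupstash currently implements:

```rust
// Sketch only: split a block into data shards, add parity shards, then
// recover after losing up to `parity_shards` of them.
// Requires the reed-solomon-erasure crate; shard counts are example values.
use reed_solomon_erasure::galois_8::ReedSolomon;

fn main() -> Result<(), reed_solomon_erasure::Error> {
    // 4 data shards + 2 parity shards: tolerates loss of any 2 shards,
    // at the cost of ~50% extra space for those blocks.
    let r = ReedSolomon::new(4, 2)?;

    // A toy "block" split into 4 equally sized data shards,
    // followed by 2 zeroed parity shards to be filled in by encode().
    let mut shards: Vec<Vec<u8>> = vec![
        vec![1, 2, 3, 4],
        vec![5, 6, 7, 8],
        vec![9, 10, 11, 12],
        vec![13, 14, 15, 16],
        vec![0; 4], // parity
        vec![0; 4], // parity
    ];
    r.encode(&mut shards)?;

    // Simulate corruption: drop two shards (in practice a corrupt shard
    // would be detected via a per-shard checksum and treated as missing).
    let mut maybe_shards: Vec<Option<Vec<u8>>> =
        shards.iter().cloned().map(Some).collect();
    maybe_shards[0] = None;
    maybe_shards[5] = None;

    // Reconstruct the missing shards from the survivors.
    r.reconstruct(&mut maybe_shards)?;
    let recovered: Vec<Vec<u8>> =
        maybe_shards.into_iter().map(|s| s.unwrap()).collect();
    assert_eq!(recovered, shards);
    Ok(())
}
```

With 4 data and 2 parity shards the repository grows by roughly 50% for the protected blocks but survives the loss of any two shards per block; a per-repository robustness setting would essentially expose this data/parity ratio to the user.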

LarsKumbier commented 3 years ago

Darrenldl's implementation looks interesting in terms of its testing, but it seems to have a memory leak that would need to be fixed first.

andrewchambers commented 3 years ago

One difficulty I can see here is that it's harder to fix corruption in the repository's sqlite3 database. However, going by probabilities, corruption is more likely to occur in the data/* files, so this is still valuable even without protection for the sqlite3 data.

LarsKumbier commented 3 years ago

That needs to be addressed nonetheless. Errors in the storage media can happen anywhere - including the database - with the same probability, so at the very least it should be tested and documented what to do in such a case. Is it possible to rebuild the database from the current live data, so that the error goes away after the next backup? What happens in the worst case, when the backup is needed and the database is corrupted?

andrewchambers commented 3 years ago

> with the same probability ...

My point was that if the data-to-metadata ratio is, say, a million to one, corruption is actually far more likely to hit the data: with uniformly distributed errors, a given bad sector is about a million times more likely to land in the data/* files than in the database. That being said, I agree and don't want to neglect this if it can be solved well.

edit:

Though it's also possible that the extra write traffic from sqlite operations wears disks more, so really I don't know.

andrewchambers commented 3 years ago

Related to #97

andrewchambers commented 3 years ago

As a note here, the new repository format stores a redundant copy of the metadata for items, which should make us a lot more resistant to this problem. We also have more freedom to add erasure codes to this new format.