dpc / rdedup

Data deduplication engine, supporting optional compression and public key encryption.

RFC: Builtin error correction #142

Open hirschenberger opened 6 years ago

hirschenberger commented 6 years ago

Would you also like to implement some error correction code, e.g. Reed-Solomon? There's already a mature crate on crates.io.

Bup does it by just running the par2 tool over the archive, but I don't like this approach.

dpc commented 6 years ago

I think it should be doable. I guess it would be a bottom-most layer that would write out the compressed and encrypted files with optional error correction, sure.

It could be neatly incorporated in the existing backend layer, so it would work transparently across all backends when writing out / reading in files.
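To make the layering idea concrete, here is a minimal sketch of a wrapper that adds redundancy on top of any backend. Everything in it is an assumption made for illustration: the `Backend` trait, `MemBackend`, and `EccBackend` are hypothetical names and not rdedup's actual API, and a single XOR parity shard stands in for Reed-Solomon (it can only rebuild one missing shard per file).

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical minimal storage interface; rdedup's real backend trait differs.
trait Backend {
    fn write(&mut self, path: &str, data: &[u8]) -> io::Result<()>;
    fn read(&self, path: &str) -> io::Result<Vec<u8>>;
}

/// In-memory backend, used only to make this sketch self-contained.
#[derive(Default)]
struct MemBackend {
    files: HashMap<String, Vec<u8>>,
}

impl Backend for MemBackend {
    fn write(&mut self, path: &str, data: &[u8]) -> io::Result<()> {
        self.files.insert(path.to_string(), data.to_vec());
        Ok(())
    }
    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "missing file"))
    }
}

/// Hypothetical wrapper: splits each file into `shards` pieces plus one XOR
/// parity piece. `shards` is assumed to be at least 1.
struct EccBackend<B: Backend> {
    inner: B,
    shards: usize,
}

impl<B: Backend> Backend for EccBackend<B> {
    fn write(&mut self, path: &str, data: &[u8]) -> io::Result<()> {
        let k = self.shards;
        // Shard length, rounded up; short shards are zero-padded.
        let len = (data.len() + k - 1) / k;
        let mut parity = vec![0u8; len];
        for i in 0..k {
            let start = (i * len).min(data.len());
            let end = ((i + 1) * len).min(data.len());
            let mut shard = data[start..end].to_vec();
            shard.resize(len, 0);
            for (p, b) in parity.iter_mut().zip(&shard) {
                *p ^= *b;
            }
            self.inner.write(&format!("{path}.{i}"), &shard)?;
        }
        self.inner.write(&format!("{path}.parity"), &parity)?;
        // Store the original length so the padding can be stripped on read.
        self.inner
            .write(&format!("{path}.len"), &(data.len() as u64).to_le_bytes())
    }

    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        let bad = |msg: &'static str| io::Error::new(io::ErrorKind::InvalidData, msg);
        let len_bytes = self.inner.read(&format!("{path}.len"))?;
        let total =
            u64::from_le_bytes(len_bytes.try_into().map_err(|_| bad("bad length"))?) as usize;
        // A shard that fails to read is treated as an erasure.
        let mut shards: Vec<Option<Vec<u8>>> = (0..self.shards)
            .map(|i| self.inner.read(&format!("{path}.{i}")).ok())
            .collect();
        let missing: Vec<usize> = (0..self.shards).filter(|&i| shards[i].is_none()).collect();
        if missing.len() > 1 {
            return Err(bad("too many missing shards"));
        }
        if let Some(&m) = missing.first() {
            // XOR of the parity with all surviving shards yields the lost one.
            let mut rebuilt = self.inner.read(&format!("{path}.parity"))?;
            for s in shards.iter().flatten() {
                for (r, b) in rebuilt.iter_mut().zip(s) {
                    *r ^= *b;
                }
            }
            shards[m] = Some(rebuilt);
        }
        let mut out: Vec<u8> = shards.into_iter().flatten().flatten().collect();
        out.truncate(total);
        Ok(out)
    }
}
```

Because `EccBackend` implements the same trait it wraps, the layers above it would not need to know whether redundancy is enabled. A real implementation would also need per-shard checksums to detect silent corruption, which this sketch does not attempt.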

I'm not sure how much this is worth from a practical perspective, but considering how natural it should be to add, feel free to discuss with other users on gitter and give it a shot.

Ralith commented 6 years ago

Previous discussion of this concluded that cloud backends already do error correction, which would limit the value of doing our own. However, if you're doing your own storage, this might still be useful. Storing two replicas requires twice as much hardware but can still permanently lose data in the event of as few as two failures, whereas 2:1 error correction coding loses data only if you lose more than half of your storage. That is a dramatic improvement in robustness if your data is spread out among more than a couple of failure boundaries.
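The comparison can be checked with a small model. This is only a sketch under stated assumptions: there are `2 * k` devices; replication mirrors device `i` onto device `i + k`; and the coded layout uses `k` data shards plus `k` parity shards with an ideal (MDS) code such as Reed-Solomon, where any `k` surviving shards are enough. The function names are hypothetical, and `failed` is assumed to contain distinct device indices.

```rust
/// Mirrored layout: data is lost as soon as both members of any pair fail.
fn replication_survives(failed: &[usize], k: usize) -> bool {
    (0..k).all(|i| !(failed.contains(&i) && failed.contains(&(i + k))))
}

/// Ideal k-of-2k erasure code: data survives while at most k devices fail.
fn erasure_survives(failed: &[usize], k: usize) -> bool {
    failed.len() <= k
}

/// Counts the two-device failure combinations that lose data under `survives`.
fn count_fatal_pairs(k: usize, survives: fn(&[usize], usize) -> bool) -> usize {
    let n = 2 * k;
    let mut fatal = 0;
    for a in 0..n {
        for b in (a + 1)..n {
            if !survives(&[a, b], k) {
                fatal += 1;
            }
        }
    }
    fatal
}
```

With `k = 4` (eight devices, the same 2x storage overhead in both layouts), `count_fatal_pairs(4, replication_survives)` finds 4 fatal pairs out of 28, while `count_fatal_pairs(4, erasure_survives)` finds none. The coded layout only fails once a fifth device is lost.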

aidanhs commented 4 years ago

@hirschenberger can you elaborate on why you don't like the bup approach?

Separately, I stumbled across https://crates.io/crates/blkar which may have some reusable components.

geek-merlin commented 3 years ago

I deem this very valuable.