Hello everybody! For long-term archival in particular, it seems important to me that we account for bit rot. I see two ways to handle it:
1. Whenever one detects that a stored block has a wrong hash, re-request the block from the network.
   Advantage: no extra storage space needed.
   Disadvantage: if you thought the file was safe because you pinned it, but you are actually the only one holding the data, there is nobody left to re-request the block from.
2. Store a small amount of redundancy (for example 1% or 0.1%) using Reed-Solomon forward error correction codes.
   Advantage: simple to set up, for example with par2cmdline (Parchive) or the implementation from Backblaze.
   Advantage: less network traffic, because you can repair the flipped bits yourself.
   Disadvantage: additional storage space needed.
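To make the two options a bit more concrete, here is a minimal sketch in Python. It is only an illustration under some assumptions: it uses a single XOR parity block as a much simpler stand-in for Reed-Solomon (so it can rebuild only one damaged block per group), and the helper names (`split_blocks`, `xor_blocks`, `find_corrupt`, `repair`) and the block size are made up for this post, not taken from any existing tool.

```python
import hashlib


def split_blocks(data, size):
    """Split data into fixed-size blocks, zero-padding the last one."""
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    if blocks and len(blocks[-1]) < size:
        blocks[-1] = blocks[-1].ljust(size, b"\0")
    return blocks


def digest(block):
    """Hash used to detect bit rot in a single block."""
    return hashlib.sha256(block).hexdigest()


def xor_blocks(blocks):
    """XOR equally sized blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def find_corrupt(blocks, hashes):
    """Return the indices of blocks whose hash no longer matches."""
    return [i for i, (b, h) in enumerate(zip(blocks, hashes)) if digest(b) != h]


def repair(blocks, hashes, parity):
    """Rebuild at most one damaged block locally from the parity block.

    Raises ValueError when local repair is impossible; that is the point
    where one would fall back to re-requesting the block from the network.
    """
    bad = find_corrupt(blocks, hashes)
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("more than one damaged block; re-request from the network")
    idx = bad[0]
    others = [b for i, b in enumerate(blocks) if i != idx]
    restored = xor_blocks(others + [parity])
    if digest(restored) != hashes[idx]:
        raise ValueError("parity block is damaged too; re-request from the network")
    fixed = list(blocks)
    fixed[idx] = restored
    return fixed
```

One would compute `hashes = [digest(b) for b in blocks]` and `parity = xor_blocks(blocks)` once when storing the data, then call `repair(blocks, hashes, parity)` during a periodic scrub. A real setup would use Reed-Solomon instead, which can recover several damaged blocks for the same kind of overhead.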
Related: https://en.wikipedia.org/wiki/Cooperative_storage_cloud#Data_redundancy
What do you think?