lrq3000 / pyFileFixity

📂🛡️Suite of tools for file fixity (data protection for long term storage⌛) using redundant error correcting codes, hash auditing and duplications with majority vote, all in pure Python🐍
MIT License

RAID1 correction #6

Open Piskvor opened 5 years ago

Piskvor commented 5 years ago

RAID 1 is mirroring one disk with a bit-by-bit copy of another disk.

This is by convention only: the marginal utility of each additional disk drops rapidly, so there are no COTS solutions above 2 disks. I run 3-disk RAID 1 arrays, exactly for correcting errors on n-1 disks. (Also, if one disk in a 2-disk array fails, the other tends in practice to fail soon after, whatever the reason: suddenly bearing the whole load? Being similar in age? Coming from the same production batch?)

RAID 1 with more than 2 disks is not impossible, merely impractical for archival because of its costly disk redundancy: it kind of works, but it's not the right tool for the job (as opposed to keeping currently used data available).

However, your point about "no detection of silent corruption" has merit. I suggest an addition to the RAID 1 paragraph:

While it's possible to have multiple disks in a RAID 1 array, you are paying a multiple of the storage price, with the same storage capacity as with a single disk, without a commensurate increase in resilience. In other words, not very efficient.
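To put rough numbers on that trade-off, here is a minimal sketch comparing n-way mirroring with an idealized erasure code of k data disks plus m parity disks. The function names and the 6+2 layout are hypothetical, chosen only for illustration, and the figures count whole-disk failures only, not silent corruption.

```python
def mirror_stats(copies):
    """n-way mirror: (usable fraction of raw capacity, disk failures survived)."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return 1 / copies, copies - 1


def erasure_code_stats(data_disks, parity_disks):
    """Idealized k+m erasure code: (usable fraction, disk failures survived).

    Assumes an MDS code (Reed-Solomon style) where any m lost disks
    can be rebuilt from the remaining k.
    """
    if data_disks < 1 or parity_disks < 0:
        raise ValueError("invalid layout")
    return data_disks / (data_disks + parity_disks), parity_disks


if __name__ == "__main__":
    # Hypothetical layouts for illustration only.
    print("3-way mirror:", mirror_stats(3))
    print("6+2 erasure code:", erasure_code_stats(6, 2))
```

Under these assumptions both layouts survive two failed disks, but the mirror leaves one third of the raw capacity usable while the 6+2 layout leaves three quarters.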

lrq3000 commented 5 years ago

Hello @Piskvor, thank you for the suggestion, this will be added :-) This was indeed the point, thank you for clarifying!

In this set of tools, both mirroring and error correction are provided at the file level. People tend to use mirroring a lot because it's easier to put in place, but error correction is a lot more efficient, though more complex. This tool set aims to make that process simpler :-)
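As a rough illustration of what file-level mirroring with a majority vote can look like, here is a minimal sketch. It is not pyFileFixity's actual implementation: the function name `majority_vote` and the per-byte strategy are assumptions made for this example, and it only handles copies of equal length.

```python
from collections import Counter


def majority_vote(copies):
    """Rebuild a byte string from equal-length copies by per-byte majority vote.

    Returns (repaired_bytes, unresolved_offsets). Offsets without a strict
    majority keep the byte from the first copy and are reported as unresolved.
    """
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("copies must have equal length")

    repaired = bytearray(copies[0])
    unresolved = []
    # zip(*copies) yields, for each offset, the byte found in every copy.
    for offset, column in enumerate(zip(*copies)):
        value, count = Counter(column).most_common(1)[0]
        if count * 2 > len(copies):
            repaired[offset] = value
        else:
            unresolved.append(offset)
    return bytes(repaired), unresolved


if __name__ == "__main__":
    good = b"hello world"
    # Three copies, two of them corrupted at different offsets.
    print(majority_vote([b"hellX world", b"hello wYrld", good]))
    # Two copies that disagree: the mismatch is detected but cannot be resolved.
    print(majority_vote([b"ab", b"ax"]))
```

With three copies, a byte damaged in a single copy is outvoted by the other two. With only two copies a mismatch can be detected but not resolved, which matches the point above about 2-disk versus 3-disk mirrors.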