za3k / qr-backup

Paper backup of files using QR codes
Other
64 stars 9 forks source link

Allow restore with some QR codes missing or damaged (parity) #2

Closed za3k closed 1 year ago

za3k commented 3 years ago

This would essentially require qr-backup --restore... there's no easy command-line programs to do this. Try to make it restorable without qr-backup if there's no damage.

Edit: could also support shamir's secret sharing scheme

za3k commented 3 years ago

Looks like most python libraries only support up to 255 codes -- that's a 35k file at default settings, much too small. May have to roll my own erasure coding, yuck.

za3k commented 3 years ago

Example problem from the branch where I was working on this: https://github.com/tomerfiliba/reedsolomon/issues/34

EmperorArthur commented 3 years ago

I've been working on this on my own implementation.

There's an easy solution that already exists. Par2 is a recovery file which generates Reed Solomon codes. It's a packetized format, so you can zero pad each packet to fit in a QR code and the programs will still read them.

The downside is that you need to be aware that recovery packets are 64 bytes larger compared to the recovery block size.

If you don't care about exact alignment, you can just use Par2 as is. If you do, I am currently working on Par2Py, which will allow for alignment and for using an optional packet type that stores the data itself inside the Par2 file itself.

The advantage of this approach is if you do one Par2 packet per QR code, then not only can you recover from some codes missing, but you can also recover from codes being scrambled!

za3k commented 3 years ago

Thanks a lot for sharing! I'll add your project to the list of related projects.

I think I would rather not bring in a dependency on the par2 format, or the par2 command line tool unless it provides some large benefit. In this case, the obvious benefit would be allowing restore with missing QR codes, without 'qr-backup' present (the biggest reason I was avoiding this feature to start with is that it would require qr-backup for restore). I'll have to think about whether I can write a bash oneliner to do what you're describing.

za3k commented 3 years ago

OK, I'm pulling out par2 discussion to its own issue.

za3k commented 2 years ago

Reed-solomon may hit limits around either 1000 or 1000000 codes. Look at LT codes, an arguably simpler and more scalable alternative. https://github.com/Spriteware/lt-codes-python . Alternatively, try interleaved reed-solomon, but this has complicated layout constraints

https://github.com/Spriteware/lt-codes-python

za3k commented 1 year ago

Done. Interleaved not supported, instead we go with simple blocking. Sadly this means you can't say something like "you can lose any 30% of codes and restore".