Infinite compression - Githubissues

Hello,

Oh, I'm surprised people still use my software. Unfortunately I couldn't finish migrating it to Python v3; there's not much left to do, but I'm short on time :-/

About your thought experiment, I think the flaw is that you forget you need to keep and use all the ECC files you generated along the way, and also that the Singleton bound limits correction to (n-k)/2 characters, where n is the total length (data + ECC) and k is the length of the data alone. In other words, at most 50% of the ECC length can be recovered.
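To make the (n-k)/2 limit concrete, here is a minimal sketch. It assumes a code that meets the Singleton bound (such as Reed-Solomon) and errors at unknown positions; the function name `max_correctable_errors` is only illustrative and is not part of pyFileFixity.

```python
def max_correctable_errors(n: int, k: int) -> int:
    """Upper limit on correctable errors for a code meeting the Singleton bound.

    n: total codeword length (data + ECC characters)
    k: number of data characters
    Assumption: errors are at unknown positions, so each one costs 2 ECC characters.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return (n - k) // 2


# 8 data characters protected by 8 ECC characters: n = 16, k = 8
print(max_correctable_errors(16, 8))
```

With 8 data characters and 8 ECC characters, this gives 4, which is the figure used in the example that follows.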

Let's say you use an ECC whose length is exactly the number of characters it protects, e.g., for 8 characters of the original file, the ECC generates 8 characters that can recover 4 characters. So now you can indeed delete 4 characters without loss. Then you re-encode the 4 remaining characters using the same ECC scheme, which generates a 4-character ECC that can recover 50% = 2 characters of its input. You can then delete 2 more characters, and what remains are 2 characters.

In the end, you get the following:

First-level ECC file that is 8 characters long.
Second-level ECC file that is 4 characters long.
Purposely recursively tampered input file that is now 2 characters long.

With these 3 files, you can fully recover your complete original file in theory. But in total you are storing 8+4+2=14 characters, whereas your original file was 8 characters, so there is no compression at all.
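The bookkeeping above can be sketched in a few lines of Python. This is only an accounting model under the assumptions of this example (an ECC as long as the data it protects, half of the ECC length recoverable); it does not run any real ECC, and `recursive_ecc_storage` is a hypothetical helper name.

```python
def recursive_ecc_storage(original_len: int, levels: int):
    """Count the characters stored after `levels` rounds of encode-then-delete.

    Returns (ecc_sizes, remaining_data_len, total_stored).
    """
    ecc_sizes = []
    remaining = original_len
    for _ in range(levels):
        ecc_len = remaining          # assumption: ECC is as long as the data it protects
        recoverable = ecc_len // 2   # at most half of the ECC length can be recovered
        ecc_sizes.append(ecc_len)    # every ECC file must be kept for later recovery
        remaining -= recoverable     # delete exactly as many characters as can be recovered
    total = sum(ecc_sizes) + remaining
    return ecc_sizes, remaining, total


ecc_sizes, remaining, total = recursive_ecc_storage(8, 2)
print(ecc_sizes, remaining, total)
```

For an 8-character file and 2 levels this yields ECC files of 8 and 4 characters, 2 remaining characters, and 14 characters stored in total. Adding more levels only adds more ECC files, so the total never drops below the original length.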

Furthermore, it's not a more interesting alternative to a plain first-level ECC in terms of resiliency. The multi-level encoding is shorter (14 characters vs 16 characters for the initial file + first-level ECC), but it has no resiliency against errors left: all of the correction capacity has already been spent on the deliberate deletions, so if any character of any of the 3 files gets corrupted, it's game over.

lrq3000 / pyFileFixity

Infinite compression #9