tomerfiliba-org / reedsolomon

⏳🛡 Pythonic universal errors-and-erasures Reed-Solomon codec to protect your data from errors and bitrot. Includes a future-proof zero-dependencies pure-python implementation 🔮 and an optional speed-optimized Cython/C extension 🚀
http://pypi.python.org/pypi/reedsolo
Other
351 stars 86 forks source link

How to use for Bit Correction #40

Closed fspider closed 1 year ago

fspider commented 2 years ago

Data length = 64 bit (data) + 32 bit (RS padding) = 96 bit.

Current code is trying to correct in Byte Unit. But I want to correct in bit unit so that it can correct as many bit as possible.

Thank you!

haneefmubarak commented 2 years ago

NOTE: I'm not associated with this library in particular, but perhaps this may be informative for your problem:

This library corrects burst-type errors, not raw binary — which is fairly typical in terms of how RS codes are commonly used (there are of course cases where binary stream RS is more common).

With that being said, the straightforward way to attempt to maximize correctability would be to maximize the number of code words (ie: select the smallest bit sized code that will work). Let's try different bit sizes iteratively:

Given 4 codewords of the 8 bit sized code as codable correction words, you should be able to correct up to n/2 = 4/2 = 2 unknown symbol errors or detect up to n = 4 symbol errors.

However, if you are willing to slightly relax your requirements, you could improve this slightly while maintaining the same coding rate (albeit at minor loss of encode/decode speed). For instance, if you encoded and decoded two messages together, you could effectively double the number of correctable messages: 16 data codewords (128 bits) + 8 correction codewords (64 bits) = 24 total codewords (192 bits), which would allow you to correct up to n/2 = 8/2 = 4 unknown symbol errors or detect up to n = 8 symbol errors.

You can take this to it's logical limit by encoding and decoding floor(255/12) = 21 messages together (again, with some increasing in encoding/decoding complexity resulting in decreased speed): 218 = 168 data codewords (1,344 bits) and 214 = 84 correction codewords (672 bits) yields 252 total codewords (2,016 bits), which would allow you to correct up to n/2 = 84/2 = 42 unknown symbol errors or detect up to n = 42 symbol errors.

I hope this is helpful, lmk :)

lrq3000 commented 1 year ago

To complement @haneefmubarak 's excellent answer: in a nutshell, convert your bits into bytes, and then for RSCodec just divide by 8 your design: 64 bit (data) + 32 bit (RS padding) = (n-k: 8) + (k: 4) = 12 Bytes. Where n-k is the original message length (without error correction code), and k the error correction code length (also called nsym in this module).