tomerfiliba-org / reedsolomon

⏳🛡 Pythonic universal errors-and-erasures Reed-Solomon codec to protect your data from errors and bitrot. Includes a future-proof zero-dependencies pure-python implementation 🔮 and an optional speed-optimized Cython/C extension 🚀
http://pypi.python.org/pypi/reedsolo

C implementation doesn't work with larger messages. #29

Closed doino-gretchenliev closed 1 year ago

doino-gretchenliev commented 3 years ago

I'm using the following code:

import creedsolo as rs
prim = rs.find_prime_polys(c_exp=12, fast_primes=False, single=True)
rs.init_tables(prim=prim, c_exp=12)

The error thrown is:

    c_gf_log, c_gf_exp, c_field_charac = rs.init_tables(prim=prim, c_exp=12)
  File "creedsolo.pyx", line 203, in creedsolo.init_tables
OverflowError: value too large to convert to unsigned char

I'm trying to encode a file in chunks of size 4095, without the additional internal splitting into chunks of 255. I've precalculated the generator polynomials. So far I have been able to encode the message, but not to decode it:

import creedsolo as rs
rs.init_tables()

mesecc = rs.rs_encode_msg(data, nsym, gen=gen[nsym]) # nsym = 123, len(data) = 3972
rmes, rmesecc, errata_pos = rs.rs_correct_msg(mesecc, nsym)

The error thrown is:

    rmes, rmesecc, errata_pos = rs.rs_correct_msg(mesecc, nsym)
  File "creedsolo.pyx", line 694, in creedsolo.rs_correct_msg
ValueError: Message is too long (4095 when max is 255)

Is there a way to encode chunks of size 4095 without the hidden splitting?

lrq3000 commented 1 year ago

Maybe related to #44

lrq3000 commented 1 year ago

Ok, so this is an inherent limitation: the C implementation only works with bytearrays, and a bytearray element can only hold values up to 255, IIRC. If you want to use higher Galois fields, you need to use the pure Python version, or rewrite the C implementation to use lists instead of bytearrays (which would be MUCH slower, so it defeats the purpose; you are better off simply using the pure Python version under PyPy).
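To illustrate the limitation, here is a minimal standard-library sketch (it does not use creedsolo itself). It assumes the field exponent 12 from the report, so the largest symbol is 4095. The helper `fits_in_bytearray` is a hypothetical name introduced only for this example.

```python
from array import array

FIELD_EXP = 12                    # c_exp used in the report above
max_symbol = 2 ** FIELD_EXP - 1   # largest symbol value in GF(2^12)


def fits_in_bytearray(value):
    """Return True if a bytearray element can store this symbol value."""
    try:
        bytearray([value])
        return True
    except ValueError:
        # bytearray elements must be integers in range(0, 256)
        return False


print(fits_in_bytearray(255))         # True
print(fits_in_bytearray(max_symbol))  # False

# A wider typed array (or a plain list) can hold the larger symbol.
wide = array("i", [max_symbol])
print(wide[0])
```

This is presumably the same kind of constraint behind the `OverflowError` in `init_tables`: with `c_exp=12` the table entries no longer fit in an unsigned char.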

About the lack of chunking: this issue was also reported in #44, and I may merge the fix offered there.

lrq3000 commented 1 year ago

Ah no, there is no chunking anyway when you call the functions directly without the RSCodec object. Only RSCodec does automatic chunking, not the functions.
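If you call the functions directly, you would have to do the chunking yourself. A rough sketch of the splitting step only, assuming the default field (codewords of at most 255 symbols) and the `nsym = 123` and `len(data) = 3972` from the report. `split_into_chunks` is a hypothetical helper, not part of the library, and the actual encode/decode calls on each chunk are left out.

```python
def split_into_chunks(data, nsym, c_exp=8):
    """Split data so each chunk plus nsym ECC symbols fits in one codeword."""
    max_codeword = 2 ** c_exp - 1      # 255 for the default GF(2^8)
    chunk_size = max_codeword - nsym   # data symbols available per codeword
    if chunk_size <= 0:
        raise ValueError("nsym leaves no room for data")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


data = bytes(3972)                     # placeholder payload, same length as in the report
chunks = split_into_chunks(data, nsym=123)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 31 132 12
```

Each chunk would then be encoded and decoded on its own, so each resulting codeword stays within the 255-symbol maximum reported by `rs_correct_msg`.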