tomerfiliba-org / reedsolomon

⏳🛡 Pythonic universal errors-and-erasures Reed-Solomon codec to protect your data from errors and bitrot. Includes a future-proof zero-dependencies pure-python implementation 🔮 and an optional speed-optimized Cython/C extension 🚀
http://pypi.python.org/pypi/reedsolo

The project description is inaccurate #53

Closed mgorny closed 1 year ago

mgorny commented 1 year ago

I'm sorry for being picky but the description right now says:

Pure-Python Reed Solomon encoder/decoder

However, the default install includes a C module, so that's not really "Pure-Python" and could lead to confusion. Would it be fine to change that to just "Python Reed Solomon encoder/decoder"?

lrq3000 commented 1 year ago

Thank you for your suggestion. The C module is entirely optional and always will be. On top of that, it is transpiled from a *.pyx module written in Cython, a statically typed Python-like language, hence the description. The pure-Python version will remain the gold standard; the C implementation is only a bonus for those who want more speed. The documentation already states quite clearly that the C module exists and that it is complementary.
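For illustration, an "optional extension with a pure-Python fallback" arrangement like the one described above is commonly consumed as sketched below. This is a minimal sketch, not this project's loader: `load_backend` is a hypothetical helper, and the module names are stand-ins chosen so that the snippet runs without any third-party package installed.

```python
import importlib
import importlib.util


def load_backend(candidates):
    """Import and return the first module in `candidates` that is installed."""
    for name in candidates:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return importlib.import_module(name)
    raise ImportError(f"none of {list(candidates)!r} is installed")


# Stand-in names (assumptions): the first plays the role of a compiled
# extension that is absent, the second the always-available pure-Python module.
backend = load_backend(["_hypothetical_fast_ext", "json"])
print(backend.__name__)
```

Listing the fast module first and the pure-Python one last means the caller gets the speed-up when the extension was built, and identical behaviour otherwise.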

lrq3000 commented 1 year ago

Also, for context: all other "Python Reed-Solomon codecs" are actually just wrappers around non-Python implementations, so you can't simply edit the Python code, and the Python code is not sufficient on its own (in fact, it does little of the maths).

Hence, to formalize my thought process, I consider as "Pure-Python" any Python package that does not require external non-Python tools to function. With this definition, which I believe is widely agreed on by other long-time Pythonistas, the outputs of a Python package do not matter. For example, I don't think anyone regards a Python package that generates DLLs or binaries, such as script entry points, as "non pure Python". I think a transpiled Cythonized extension is a similar case.

And this definition is not an ad hoc, after-the-fact rationalization: I use it for every other Python package I have published so far.

mgorny commented 1 year ago

I'm afraid I have to disagree about the common use of "pure Python". At least what I've been taught is that "pure Python" means "no C code involved, so it won't segfault on me" (without getting into the details on how this is actually wrong ;-)).

mgorny commented 1 year ago

A word of context to my use of the term: Linux distributions often treat Python packages that don't include any non-Python code as "noarch", i.e. roughly equivalent to "purelib" wheels. On the other hand, packages that do install C extensions need to be built and tested on every architecture separately (i.e. "platlib"). Having "pure Python" in the description here initially led me to believe, wrongly, that this package falls into the former category.
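To make the purelib/platlib distinction concrete, here is a rough sketch that classifies a wheel by the tags in its filename. It assumes the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`), where architecture-independent wheels are typically tagged `none-any`. The helper names and the example filenames are hypothetical.

```python
def wheel_tags(filename):
    """Return the (python, abi, platform) tags from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return tuple(parts[-3:])


def is_noarch(filename):
    """True when the wheel claims no ABI and no platform dependency."""
    _python, abi, platform = wheel_tags(filename)
    return abi == "none" and platform == "any"


# Hypothetical filenames, for illustration only.
print(is_noarch("example_pkg-1.0-py3-none-any.whl"))
print(is_noarch("example_pkg-1.0-cp311-cp311-manylinux_2_17_x86_64.whl"))
```

A package that ships a compiled extension by default produces wheels of the second kind, which is why a packager would need to build and test it per architecture.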

lrq3000 commented 1 year ago

Thank you for your explanation.

Would the following rewriting of the description be OK?

A pythonic universal errors-and-erasures Reed-Solomon Codec to protect your data from errors and bitrot. It includes a pure python implementation and an optional speed-optimized Cython/C extension.

This is a burst-type implementation, so it supports any Galois field higher than 2^3, but not binary streams. Burst errors are non-random errors that occur more often on data storage media such as hard drives; hence this library is better suited to data storage protection than to stream noise correction, although it also works for the latter with a bit of overhead (since it works with bytes only, instead of bits).

Based on the wonderful tutorial at Wikiversity, written by "Bobmath" and "LRQ3000". If you are just starting with Reed-Solomon error correction codes, the Wikiversity article is a good beginner's introduction.
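As a small illustration of why a byte-oriented codec favours burst errors over scattered bit noise, the sketch below counts how many bytes (symbols) are touched by the same number of flipped bits in two layouts. It does not use the library. The helper name and bit positions are made up for the example, and it relies on the general Reed-Solomon property that correcting t symbol errors at unknown positions takes 2t parity symbols.

```python
def corrupted_bytes(bit_positions):
    """Count the distinct bytes touched by flipping the given bit positions."""
    return len({pos // 8 for pos in bit_positions})


# 16 flipped bits in one contiguous burst (bits 100..115).
burst = range(100, 116)
# 16 flipped bits spread out so that each one lands in a different byte.
scattered = range(3, 3 + 16 * 8, 8)

for label, flips in (("burst", burst), ("scattered", scattered)):
    symbols = corrupted_bytes(flips)
    # 2 parity symbols per symbol error at an unknown position.
    print(label, "bytes hit:", symbols, "parity bytes needed:", 2 * symbols)
```

The burst damages 3 bytes while the scattered flips damage 16, so the same number of bit errors costs far more parity when the errors are spread out. That is the overhead mentioned for stream noise correction.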

lrq3000 commented 1 year ago

Please read the issue on GitHub; I have edited the description I previously suggested.

mgorny commented 1 year ago

Yes, that is much more descriptive, thank you.

lrq3000 commented 1 year ago

Thank you for your feedback and your patience. It's now updated :-) I won't push to PyPI since no code has changed, but it will be pushed with the next bugfix! Have a great day and Merry Christmas!