Nicoretti / crc

Pure Python CRC library
https://nicoretti.github.io/crc/
BSD 2-Clause "Simplified" License
53 stars 13 forks source link

Add support for arbitrary input and output bit widths #26

Open Nicoretti opened 2 years ago

Nicoretti commented 2 years ago

Relevant other work (see also #25)

smarek commented 2 years ago

@Nicoretti Since I'm afraid this will be hard task without breaking current interfaces, i've re-implemented your interfaces and instead of operating on Byte (custom) class i'm using exclusively bitarray, i've also strong-typed all the interfaces (functions, return indices, ...) and added some asserts to be a bit more comfortable with the code

I currently have no intention on releasing this work as standalone crc library, but if you'd be interested in using it, or just as inspiration, feel free to use it anyhow

re-implementation here: https://github.com/OK-DMR/ok-dmrlib/blob/master/okdmr/dmrlib/etsi/crc/crc.py

This also addresses and solves the original #25 report, which was right, and i'm afraid not even the crccheck project from Martin Scharrer is capable of doing this, even crccheck operates on bytes (not bitstreams) and those references are only CRC's that operate on bytes and produce non-8bit-sized output

My re-implementation should provide for any arbitrary sized input, but i've yet to build a better test-suite to catch some corner-cases, current test-suite is here: https://github.com/OK-DMR/ok-dmrlib/blob/master/okdmr/tests/dmrlib/etsi/crc/test_crc9.py

All the best!

Nicoretti commented 2 years ago

Hi @smarek,

thanks for coming back to me and providing "insperation" -> appreciate it ;D.

This also addresses and solves the original #25 report, which was right, and i'm afraid not even the crccheck project from Martin Scharrer is capable of doing this, even crccheck operates on bytes (not bitstreams) and those references are only CRC's that operate on bytes and produce non-8bit-sized output

I think resticting the implementation to bytes (for the input) is mainly done so the implementation can use lookup tables to improve the performance. Though I think one also could implement a hybrid which feeds 8 bit (1 Byte) chunks where possible and use lookup tables for that, while dealing with chunks smaller than 8 bit and the edge cases in a bit manner.

Maybe for starters I could add your implementation as an alternative if non 8 bit sized output and/or a bitwise input is needed.

So Thanks a lot, also all the best for you!

smarek commented 2 years ago

@Nicoretti cheers and thanks mate, in crc-9 use-case table-based lookup works pretty slick, only the lookup table generator function might not be written well enough for all crc sizes (since it currently creates table not on 0-255 but on 0-511 range)

also from my perspective since bitarray is written in C/C++ and already provides all the bitwise operations, it's pretty easy to use and might be worth looking into replacing the Byte class basis (integers) with bitarray

needs more testing, but hey, if you're happy enough, i'm as well :)

Nicoretti commented 2 years ago

@Nicoretti cheers and thanks mate, in crc-9 use-case table-based lookup works pretty slick, only the lookup table generator function might not be written well enough for all crc sizes (since it currently creates table not on 0-255 but on 0-511 range)

good point(s).

also from my perspective since bitarray is written in C/C++ and already provides all the bitwise operations, it's pretty easy to use and might be worth looking into replacing the Byte class basis (integers) with bitarray

That's nice, for this library though some of my major design decisions have been:

  1. do not use any dependencies (except python stdlib) -> portability + all the benefits you get if you don't have any dependencies
  2. provide a pure python implementation -> portability, ease of use
  3. Ideally keep it as single module (python file) -> ease of use, if the consuming project does not want to have dependencies one can just copy it into the project

I need to think about the pros and cons

smarek commented 2 years ago

I actually just extended the test-suite, comparing CRC16(CCIT) and CRC32 in usage, with small improvement over reversing output bytes, they result in same output

However for obvious reasons, table based implementations are currently far-more-inferior to bit-by-bit processing And handling data of length not divisible by crc-size currently fails as well

Also i probably have similar point-of-view on using dependencies, however i had to settle on "do not add unnecessary dependencies", thus allowing myself to use bitarray library, since it's widely supported, used and tested, but for anything that can be solved without library, or the library itself is so tiny/insignificant, i guess it's better to use (or write from scratch) the module internally

thank you

smarek commented 2 years ago

Well update, i just got it working for arbitrary bit-sized input (see https://github.com/OK-DMR/ok-dmrlib/commit/840291e33efbe436b677f3dfc19821e92f528243 ), and lookup tables are created with optimal size (for crc[1-15] it's self size, for crcs like [16,32,...] it's greatest possible divisor, eg. byte for crc16/crc32, 9 bits for crc18, etc.), and that's it, it's fully working, compatible with your current crc implementation (both table and processing byte-by-byte where applicable), and test contains sanity check, that our libraries provide same output for same input+algorithm

If you'd have anything in mind to change/update, let me know, but I consider current state stable enough to be used world-wide For my peace of mind, i currently have nothing more, to add or fix/optimize

Nicoretti commented 2 years ago

nice, thx for sharing