Open monoxgas opened 5 years ago
Also, for quick reference, the CRC-32 section of the current framing2
spec: https://github.com/google/snappy/blob/e9e11b84e629c3e06fbaa4f0a86de02ceb9d6992/framing_format.txt#L39
The plot thickens.
Upon further inspection, I believe the implementation here is indeed correct and it's the Mozilla repo which is at fault. I also erroneously claimed that python-snappy
disagrees with the CRC checksums currently being computed, however they do indeed match.
Specifically, the formal spec of CRC-32/ISCSI
(CRC-32C
) states that the initialization value should be 0xffffffff
/~0
and that the final value should be complemented : x ^ 0xffffffff
or x = ~x
. This is exactly what is happening here:
https://github.com/kubo/snzip/blob/809c6f2db74bdd2ccfc34f4dc30adf0338161af3/crc32.h#L15
References:
http://reveng.sourceforge.net/crc-catalogue/all.htm
... init=0xffffffff ... xorout=0xffffffff
However this still creates a small pickle, as there is now technically another "flavour" for this single compression format. Given that the difference is so small, I would hesitate to create an entirely new format such as mozilla-framing
, but whatever fits the project best. An alternative solution might be to compare both the correct checksum and it's complement. It's technically inaccurate, but might save some headache and should never cause any real issues.
I made a small program to compare the results of the implementation with CRC examples.
$ git clone https://github.com/kubo/snzip.git
$ cd snzip
$ ./autogen.sh
$ ./configure
$ wget https://gist.githubusercontent.com/kubo/c6bc9f038e3dd66bfb37da4f8596d119/raw/b97e2a7d325fc15edbc149a036f0605f79a2e22d/crc32_test.c
...omitted...
$ gcc -o crc32_test crc32_test.c crc32.c crc32_sse4_2.c -msse4.2
$ ./crc32_test
32 bytes of zeroes: OK
32 bytes of ones: OK
32 bytes of incrementing 00..1f: OK
32 bytes of decrementing 1f..00: OK
An iSCSI - SCSI Read (10) Command PDU: OK
The results are same with the CRC examples.
As for your two ideas, I prefer the former. However I postpone my decision for a while.
https://github.com/kubo/snzip/blob/809c6f2db74bdd2ccfc34f4dc30adf0338161af3/crc32.h#L15
It would appear that while calculating the masked CRC in the function above, you perform a bitwise NOT operator on the result before masking. I'm unsure what formats require this, if any (possibly
snappy-in-java
), but this creates a mismatched CRC for most other public implementations offraming2
. Specifically:python-snappy
library : https://github.com/andrix/python-snappy/blob/3c9cad5e82c48416c930e70ae21cc7a0ee693a6f/snappy/snappy.py#L67I was hesitant to make a PR until I verified which formats might require it, but it should be an easy fix (I'd be happy to do it). Possibly two different functions, one which inverts and one which does not.