Oldes / Rebol-issues

Issue tracker for https://github.com/oldes/Rebol3
CRC-32 checksum is negative integer (why not BINARY! or positive?) #2376

Open Siskin-Bot opened 4 years ago

Siskin-Bot commented 4 years ago

Submitted by: Hostilefork

The CRC32 exposed in R3-Alpha returned a signed integer.

r3-alpha>> checksum/method #{AE} 'crc32   
== -479436446

Red does the same (though Red's INTEGER! is limited to 32-bit signed, so unsigned is not an option there: it could only represent half of the CRC values...)

red>> checksum #{AE} 'crc32  ; Red has no /METHOD refinement; the method argument is required
== -479436446

The more common concept of CRC-32 is unsigned. But code has worked regardless, because what really matters is generally the bytes: things that check CRC-32 are typically decoding streams and have to be sensitive to big endian / little endian.
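The signed and unsigned readings are the same 32 bits interpreted two ways. A minimal Python sketch of the conversion, using the value from the console session above (the helper names `to_unsigned32` and `to_signed32` are hypothetical, chosen only for illustration):

```python
MASK32 = 0xFFFFFFFF  # low 32 bits


def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & MASK32


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


signed_crc = -479436446                    # result shown for #{AE} above
unsigned_crc = to_unsigned32(signed_crc)
print(unsigned_crc)                        # 3815530850
print(to_signed32(unsigned_crc))           # -479436446
```

Either form round-trips to the other without loss, so the choice affects presentation and comparisons, not the checksum itself.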

Other common checksum types return BINARY!:

>> checksum/method #{ABCD} 'md5
== #{7838496FD0586421BBB500BB6F472F13}

>> checksum/method #{ABCD} 'sha1
== #{32825EB98DE842EE3E4DF005A07B7D65522A46A0}

So returning CRC32 as a 4-byte BINARY! would seem to dodge concerns about the integer representation. Unfortunately, no one appears to have standardized the byte order of transmission for CRC-32. Two places we use it are little endian:

If it has to be an integer, it would seem that it should be unsigned. Note that Python 2's plain integer (like Red's and Rebol2's INTEGER!) was a fixed-width signed value, whereas Python 3 has bignums for arbitrary-precision integers. So its CRC-32 switched to unsigned:

https://stackoverflow.com/questions/32940417/unsigned-crc-32-for-python-to-match-javas-crc-32
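To make the byte-order concern concrete, here is a small Python 3 sketch. It uses the standard `zlib.crc32`, which returns an unsigned value in Python 3, and serializes the result both ways. The input `b"a"` is chosen for illustration only.

```python
import zlib

data = b"a"
crc = zlib.crc32(data)  # Python 3: always in the range 0 .. 2**32 - 1

# The same integer yields two different 4-byte sequences,
# depending on the byte order the consumer expects.
little = crc.to_bytes(4, "little")
big = crc.to_bytes(4, "big")

# The signed 32-bit view of the same bits.
signed = crc - (1 << 32) if crc >= (1 << 31) else crc

print(hex(crc))      # 0xe8b7be43
print(little.hex())  # 43beb7e8
print(big.hex())     # e8b7be43
print(signed)        # -390611389
```

A BINARY! result would therefore have to commit to one of the two orders, which the integer form leaves to the caller.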


Imported from: https://github.com/rebol/rebol-issues/issues/2375

Comments:

Oldes commented on Sep 19, 2019:

I would keep it as it is... or what is the problem with a signed integer? Returning a binary may look better in a console session, but it wastes resources. Converting to an unsigned integer is also an operation that may be unnecessary, and one can easily convert the signed integer to unsigned if needed.


Oldes commented 4 years ago

I'm closing it... I prefer to be compatible with Red language.

Oldes commented 9 months ago

Maybe all checksum results could be unified... currently there is:

>> foreach m system/catalog/checksums [ printf [10] reduce [m mold/flat checksum "a" m] ]
adler32   6422626
crc24     9657536
crc32     -390611389
md4       #{BDE52CB31DE33E46245E05FBDBD6FB24}
md5       #{0CC175B9C0F1B6A831C399E269772661}
ripemd160 #{0BDC9D2D256B3EE9DAAE347BE6F4DC835A467FFE}
sha1      #{86F7E437FAA5A7FCE15D1DDCB9EAEAEA377667B8}
sha224    #{ABD37534C7D9A2EFB9465DE931CD7055FFDB8879563AE98078D6D6D5}
sha256    #{CA978112CA1BBDCAFAC231B39A23DC4DA786EFF8147C4E72B9807785AFEE48BB}
sha384    #{54A59B9F22B0B80880D8427E548B7C23ABD873486E1F035DCE9CD697E85175033CAA88E6D57BC35EFAE0B5AFD3145F31}
sha512    #{1F40FC92DA241694750979EE6CF582F2D5D7D28E18335DE05ABC54D0560E0F5302860C652BF08D560252AA5E74210546F369FBBBCE8C12CFC7957B2652FE9A75}
sha3-224  #{9E86FF69557CA95F405F081269685B38E3A819B309EE942F482B6A8B}
sha3-256  #{80084BF2FBA02475726FEB2CAB2D8215EAB14BC6BDD8BFB2C8151257032ECD8B}
sha3-384  #{1815F774F320491B48569EFEC794D249EEB59AAE46D22BF77DAFE25C5EDC28D7EA44F93EE1234AA88F61C91912A4CCD9}
sha3-512  #{697F2D856172CB8309D6B8B97DAC4DE344B549D4DEE61EDFB4962D8698B7FA803F4F93FF24393586E28B5B957AC3D1D369420CE53332712F997BD336D09AB02A}
xxh3      #{E6C632B61E964E1F}
xxh32     #{550D7456}
xxh64     #{D24EC4F1A98C6E5B}
xxh128    #{A96FAF705AF16834E6C632B61E964E1F}
tcp       65438
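As a rough illustration of what a unified result could look like, the Python sketch below renders the integer-valued entries from the listing above as fixed-width binaries. The byte widths and the big-endian order are assumptions made for this sketch, not a decision taken in this thread, and `checksum_to_bytes` is a hypothetical helper.

```python
# Assumed widths in bytes for the integer-valued checksum methods.
WIDTHS = {"adler32": 4, "crc24": 3, "crc32": 4, "tcp": 2}


def checksum_to_bytes(method: str, value: int) -> bytes:
    """Render an integer checksum as fixed-width big-endian bytes (assumed order)."""
    width = WIDTHS[method]
    mask = (1 << (8 * width)) - 1  # also folds a negative value into range
    return (value & mask).to_bytes(width, "big")


# Integer results copied from the listing above (input "a").
listing = {"adler32": 6422626, "crc24": 9657536, "crc32": -390611389, "tcp": 65438}

for method, value in listing.items():
    print(f"{method:<10}#{{{checksum_to_bytes(method, value).hex().upper()}}}")
```

Under these assumptions the four integer rows would print as `#{00620062}`, `#{935CC0}`, `#{E8B7BE43}` and `#{FF9E}`, matching the style of the hash rows.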