Open Siskin-Bot opened 4 years ago
I'm closing it... I prefer to be compatible with Red language.
Maybe all checksum results could be unified... currently there is:
>> foreach m system/catalog/checksums [ printf [10] reduce [m mold/flat checksum "a" m] ]
adler32 6422626
crc24 9657536
crc32 -390611389
md4 #{BDE52CB31DE33E46245E05FBDBD6FB24}
md5 #{0CC175B9C0F1B6A831C399E269772661}
ripemd160 #{0BDC9D2D256B3EE9DAAE347BE6F4DC835A467FFE}
sha1 #{86F7E437FAA5A7FCE15D1DDCB9EAEAEA377667B8}
sha224 #{ABD37534C7D9A2EFB9465DE931CD7055FFDB8879563AE98078D6D6D5}
sha256 #{CA978112CA1BBDCAFAC231B39A23DC4DA786EFF8147C4E72B9807785AFEE48BB}
sha384 #{54A59B9F22B0B80880D8427E548B7C23ABD873486E1F035DCE9CD697E85175033CAA88E6D57BC35EFAE0B5AFD3145F31}
sha512 #{1F40FC92DA241694750979EE6CF582F2D5D7D28E18335DE05ABC54D0560E0F5302860C652BF08D560252AA5E74210546F369FBBBCE8C12CFC7957B2652FE9A75}
sha3-224 #{9E86FF69557CA95F405F081269685B38E3A819B309EE942F482B6A8B}
sha3-256 #{80084BF2FBA02475726FEB2CAB2D8215EAB14BC6BDD8BFB2C8151257032ECD8B}
sha3-384 #{1815F774F320491B48569EFEC794D249EEB59AAE46D22BF77DAFE25C5EDC28D7EA44F93EE1234AA88F61C91912A4CCD9}
sha3-512 #{697F2D856172CB8309D6B8B97DAC4DE344B549D4DEE61EDFB4962D8698B7FA803F4F93FF24393586E28B5B957AC3D1D369420CE53332712F997BD336D09AB02A}
xxh3 #{E6C632B61E964E1F}
xxh32 #{550D7456}
xxh64 #{D24EC4F1A98C6E5B}
xxh128 #{A96FAF705AF16834E6C632B61E964E1F}
tcp 65438
Submitted by: Hostilefork
The CRC32 exposed in R3-Alpha returned a signed integer.
Red does the same (though they are limited to 32-bit signed INTEGER!, so they don't have a choice to be unsigned, or they could only represent half the CRC values...)
The more common concept of CRC-32 is unsigned. But since code has worked regardless, what generally matters is the bytes, because things that check CRC-32 are typically decoding streams and have to be sensitive to big endian vs. little endian.
Other common checksum types return BINARY!:
So it seems doing that for CRC32 as a 4 byte binary would dodge concerns of the integer representation. But it seems (unfortunately) no one has standardized the byte order of transmission for CRC-32. Two places we use it are little endian:
PKZIP spec says, *"All values MUST be stored in little-endian byte order unless otherwise specified in this document for a specific data element."*
...but despite little endian winning in many hardware areas, people still seem skeptical of saying any standard has won.
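To make the byte-order concern concrete, here is a minimal sketch in Python 3 (chosen only for illustration, it is not Rebol code). It assumes the standard `zlib.crc32` and shows that the same CRC-32 of "a" gives two different 4-byte binaries depending on the order chosen:

```python
import zlib

data = b"a"
crc = zlib.crc32(data)  # Python 3 returns an unsigned value in 0..2**32-1

# The same 32-bit value serialized in both byte orders.
little = crc.to_bytes(4, "little")  # order used for PKZIP fields
big = crc.to_bytes(4, "big")        # the "network order" alternative

print(hex(crc))      # 0xe8b7be43
print(little.hex())  # 43beb7e8
print(big.hex())     # e8b7be43
```

So a BINARY! result avoids the signed/unsigned question, but only by forcing a choice between these two byte sequences.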
If it has to be an integer, it would seem that it should be unsigned. Note that Python 2's `zlib.crc32` returned a signed 32-bit value (the same range as Red's and Rebol2's integers), but Python 3 has bignums for arbitrary-precision integers, so it switched to unsigned:
https://stackoverflow.com/questions/32940417/unsigned-crc-32-for-python-to-match-javas-crc-32
Imported from: https://github.com/rebol/rebol-issues/issues/2375
Comments:
I would keep it as it is. What is the problem with a signed integer? Returning a binary may look better in a console session, but it wastes resources. The conversion to an unsigned integer may also be unnecessary, and one can easily convert the signed integer to unsigned if needed.
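For reference, the conversion mentioned above is a single mask. The sketch below is in Python rather than Rebol, and the helper names `to_unsigned32` and `to_signed32` are hypothetical, introduced only for this illustration. It uses the signed `crc32` value for "a" from the listing in this thread:

```python
def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the signed value is negative.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


signed_crc = -390611389                  # crc32 of "a" as printed above
unsigned_crc = to_unsigned32(signed_crc)

print(unsigned_crc)                      # 3904355907
print(hex(unsigned_crc))                 # 0xe8b7be43
print(to_signed32(unsigned_crc))         # -390611389
```

Both forms carry the same 32 bits, so either one can be recovered from the other without loss.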