jwerle / murmurhash.c

MurmurHash3 general hash based lookup function implementation
MIT License

Big endian machine fix #5

Open abalib opened 6 months ago

abalib commented 6 months ago

murmurhash.c produces wrong results on a big endian machine. I updated one line of code to use the htole32() function, which is a no-op on x86 but performs a 32-bit endian reversal on a big endian machine. I also added a few more include paths to the Makefile in case the /usr/include/endian.h file is located elsewhere on some OS.
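The kind of change described above can be pictured roughly as follows. This is a minimal sketch and not the actual diff from this PR: `load_block_le` is a hypothetical helper name, and the code assumes a glibc-style `<endian.h>` that provides `htole32()`.

```c
/* Ask glibc/musl to expose htole32() even under strict -std=c99/-std=c11. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <endian.h>   /* htole32(); header location is an assumption */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the repository): fetch one 4-byte
 * block of the key as the little-endian value MurmurHash3 expects.
 */
static uint32_t load_block_le(const unsigned char *p) {
  uint32_t k;
  memcpy(&k, p, sizeof k);  /* native-order load, safe for unaligned input */
  return htole32(k);        /* no-op on little endian, byte swap on big endian */
}
```

With this helper, the bytes `01 02 03 04` are read as `0x04030201` on both kinds of machine, which is what keeps the hash value independent of the host byte order.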

I did rudimentary testing with various string lengths, including odd lengths that are not multiples of 4. The results are the same on the little endian and big endian systems.
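For readers who want to repeat this kind of check without depending on `<endian.h>`, here is a hedged sketch of a textbook MurmurHash3 x86_32 that assembles each block from bytes with shifts. It is not the repository's code: the names `load_le32`, `murmur3_32_sketch`, and `print_test_hashes` are hypothetical, and since the seed used by `./murmur` is not stated in this thread, the output is not claimed to match the numbers below.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

/* Assemble 4 bytes as a little-endian word using shifts only. */
static uint32_t load_le32(const unsigned char *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Textbook MurmurHash3 x86_32; a sketch, not the repository's implementation. */
static uint32_t murmur3_32_sketch(const void *key, size_t len, uint32_t seed) {
  const unsigned char *data = (const unsigned char *)key;
  const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
  uint32_t h = seed;
  size_t nblocks = len / 4;

  /* Body: full 4-byte blocks, read in a byte-order-independent way. */
  for (size_t i = 0; i < nblocks; i++) {
    uint32_t k = load_le32(data + 4 * i);
    k *= c1; k = rotl32(k, 15); k *= c2;
    h ^= k; h = rotl32(h, 13); h = h * 5 + 0xe6546b64u;
  }

  /* Tail: the 1 to 3 leftover bytes when len is not a multiple of 4. */
  const unsigned char *tail = data + 4 * nblocks;
  uint32_t k = 0;
  switch (len & 3) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
            k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
  }

  /* Finalization mix. */
  h ^= (uint32_t)len;
  h ^= h >> 16; h *= 0x85ebca6bu;
  h ^= h >> 13; h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

/* Print hashes for the five inputs used in this thread (lengths 8 to 12). */
static void print_test_hashes(uint32_t seed) {
  const char *inputs[] = {"abcdefgh", "abcdefgha", "abcdefghab",
                          "abcdefghabc", "abcdefghabcd"};
  for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    printf("%s -> %u\n", inputs[i],
           (unsigned)murmur3_32_sketch(inputs[i], strlen(inputs[i]), seed));
}
```

Calling `print_test_hashes(seed)` from a `main` on both a little endian and a big endian host and diffing the output gives the same kind of comparison as the logs that follow. The five inputs cover a tail of 0, 1, 2, 3, and 0 bytes, so every branch of the tail switch is exercised.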

BASELINE RESULTS ON A LITTLE ENDIAN, x86 SYSTEM
abali@css-host-165:~/project/murmurhash.c$ uname -a
Linux css-host-165 5.4.0-169-generic #187-Ubuntu SMP Thu Nov 23 14:52:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
abali@css-host-165:~/project/murmurhash.c$ git log | head -1
commit ffaacccda0f647a99806160fe90db4b012219603
abali@css-host-165:~/project/murmurhash.c$ echo -n abcdefgh | ./murmur
1239272644
abali@css-host-165:~/project/murmurhash.c$ echo -n abcdefgha | ./murmur
1732915885
abali@css-host-165:~/project/murmurhash.c$ echo -n abcdefghab | ./murmur
2688697735
abali@css-host-165:~/project/murmurhash.c$ echo -n abcdefghabc | ./murmur
1789602127
abali@css-host-165:~/project/murmurhash.c$ echo -n abcdefghabcd | ./murmur
1971538362

WRONG RESULT OBTAINED ON A BIG ENDIAN SYSTEM
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefgh | ./murmur
3619961239
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefgha | ./murmur
1473211031
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghab | ./murmur
3082355046
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghabc | ./murmur
1590627740
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghabcd | ./murmur
1037389485
(base) abali@linux0d:~/project/murmurhash$ uname -a
Linux linux0d 6.6.2 #1 SMP Mon Nov 27 17:55:46 EST 2023 s390x s390x s390x GNU/Linux

PATCHED CODE TESTED ON THE BIG ENDIAN SYSTEM, VERIFY THAT THE RESULT IS THE SAME AS THE BASELINE
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefgh | ./murmur
1239272644
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefgha | ./murmur
1732915885
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghab | ./murmur
2688697735
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghabc | ./murmur
1789602127
(base) abali@linux0d:~/project/murmurhash$ echo -n abcdefghabcd | ./murmur
1971538362

PATCHED CODE TESTED ON THE LITTLE ENDIAN SYSTEM X86, VERIFY THAT THE RESULT IS THE SAME AS ABOVE
abali@css-host-165:~/project/murmurhash$ echo -n abcdefgh | ./murmur
1239272644
abali@css-host-165:~/project/murmurhash$ echo -n abcdefgha | ./murmur
1732915885
abali@css-host-165:~/project/murmurhash$ echo -n abcdefghab | ./murmur
2688697735
abali@css-host-165:~/project/murmurhash$ echo -n abcdefghabc | ./murmur
1789602127
abali@css-host-165:~/project/murmurhash$ echo -n abcdefghabcd | ./murmur
1971538362
abali@css-host-165:~/project/murmurhash$ uname -a
Linux css-host-165 5.4.0-169-generic #187-Ubuntu SMP Thu Nov 23 14:52:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux