karanlyons / murmurHash3.js

MurmurHash3, in JavaScript.
MIT License

hash128 not matching python #8

Closed by maneeshsahu 4 years ago

maneeshsahu commented 4 years ago

Hi @karanlyons, I can't get the hash128 output of this library to match Python's mmh3.

Python mmh3: hex(mmh3.hash128("I will not buy this tobacconist's, it is scratched."))

Yields: 0x67d73523f0079673d30654abbd8227e3

But in your readme: murmurHash3.x64.hash128("I will not buy this tobacconist's, it is scratched.");

Yields: d30654abbd8227e367d73523f0079673

Why is there a mismatch?

karanlyons commented 4 years ago
$ g++ ./mmh3_reference_repl.cpp; and ./a.out
> 
00000000000000000000000000000000
> I will not buy this tobacconist's, it is scratched.
d30654abbd8227e367d73523f0079673

There is a mismatch because the mmh3 library for Python is incorrect. Specifically, it swaps the order of the two final uint64_t values relative to the reference implementation.

I could not tell you why it does this, but if you can’t find a Python implementation/binding that is accurate against the reference implementation, you can simply swap the two 64-bit halves:

>>> o = mmh3.hash128("I will not buy this tobacconist's, it is scratched.")
>>> hex(((o & 0xffffffffffffffff) << 64) + (o >> 64))
'0xd30654abbd8227e367d73523f0079673'
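The same swap can be wrapped in a small helper. This is only a sketch: the function name `swap_halves_128` is made up for illustration and is not part of mmh3 or murmurHash3.js. It takes the integer directly so that it runs without mmh3 installed.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # low 64 bits of a 128-bit value


def swap_halves_128(value: int) -> int:
    """Exchange the upper and lower 64-bit halves of a 128-bit integer."""
    low = value & MASK64            # lower 64 bits
    high = (value >> 64) & MASK64   # upper 64 bits
    return (low << 64) | high


# Value reported for Python's mmh3 earlier in this thread.
python_result = 0x67D73523F0079673D30654ABBD8227E3

# Zero-pad to 32 hex digits so leading zeros are not lost.
print(f"{swap_halves_128(python_result):032x}")
```

Formatting with `032x` rather than `hex()` keeps the string the same length as the library's output even when the top half starts with zeros.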

Taking the (unwitting) test vectors from https://github.com/aappleby/smhasher/issues/73#issuecomment-527887962:

 N |                            Bytes | MM3-128 (x64) Reference          | murmurHash3.x64.hash128
---|----------------------------------|----------------------------------|---------------------------------
 1 |                               00 | 4610abe56eff5cb551622daa78f83583 | 4610abe56eff5cb551622daa78f83583
 2 |                             0000 | 3044b81a706c5de818f96bcc37e8a35b | 3044b81a706c5de818f96bcc37e8a35b
 3 |                           000000 | 79d54dd1bf7137480af5e7f1b766291d | 79d54dd1bf7137480af5e7f1b766291d
 4 |                         00000000 | cfa0f7ddd84c76bc589623161cf526f1 | cfa0f7ddd84c76bc589623161cf526f1
 5 |                       0000000000 | 3df460ff3e17b53a17874fba56e69767 | 3df460ff3e17b53a17874fba56e69767
 6 |                     000000000000 | 7d480f9fa80ec469719af4070b74d89d | 7d480f9fa80ec469719af4070b74d89d
 7 |                   00000000000000 | f402c55ac5dec98f2de586f681711c02 | f402c55ac5dec98f2de586f681711c02
 8 |                 0000000000000000 | 28df63b7cc57c3cbf2557dfcc4e8fe52 | 28df63b7cc57c3cbf2557dfcc4e8fe52
 9 |               000000000000000000 | 73269217e5476f20f1fa3fc86728ca0c | 73269217e5476f20f1fa3fc86728ca0c
10 |             00000000000000000000 | 5b3d684f8c57ce161ba63bef94931146 | 5b3d684f8c57ce161ba63bef94931146
11 |           0000000000000000000000 | 056e0d6c8921404673c2da0104c39955 | 056e0d6c8921404673c2da0104c39955
12 |         000000000000000000000000 | a4d8ece9d7c0dfe3803bbf8eb6f0853f | a4d8ece9d7c0dfe3803bbf8eb6f0853f
13 |       00000000000000000000000000 | a10ea8b22762995abb1575409cfb7dc6 | a10ea8b22762995abb1575409cfb7dc6
14 |     0000000000000000000000000000 | 028b7708fcbbed1e8393f0698afe46ea | 028b7708fcbbed1e8393f0698afe46ea
15 |   000000000000000000000000000000 | 6ce113b115a56871195953c2230f8db2 | 6ce113b115a56871195953c2230f8db2
16 | 00000000000000000000000000000000 | 4bbd1bf27da918d6b465a9eccd791cb6 | 4bbd1bf27da918d6b465a9eccd791cb6

 N |                            Bytes | MM3-128 (x86) Reference          | murmurHash3.x86.hash128
---|----------------------------------|----------------------------------|---------------------------------
 1 |                               00 | 88c4adec54d201b954d201b954d201b9 | 88c4adec54d201b954d201b954d201b9
 2 |                             0000 | 04a872bbedcd774bedcd774bedcd774b | 04a872bbedcd774bedcd774bedcd774b
 3 |                           000000 | e0d93642acf40e87acf40e87acf40e87 | e0d93642acf40e87acf40e87acf40e87
 4 |                         00000000 | cc066f1f9e5178409e5178409e517840 | cc066f1f9e5178409e5178409e517840
 5 |                       0000000000 | 50a68ecfd01a6609d01a6609d01a6609 | 50a68ecfd01a6609d01a6609d01a6609
 6 |                     000000000000 | 777fa95660bde92360bde92360bde923 | 777fa95660bde92360bde92360bde923
 7 |                   00000000000000 | 0d45d85efb848988fb848988fb848988 | 0d45d85efb848988fb848988fb848988
 8 |                 0000000000000000 | e028ae414772b0844772b0844772b084 | e028ae414772b0844772b0844772b084
 9 |               000000000000000000 | 5ad58a7e543371085433710854337108 | 5ad58a7e543371085433710854337108
10 |             00000000000000000000 | 64010da262e8bc1762e8bc1762e8bc17 | 64010da262e8bc1762e8bc1762e8bc17
11 |           0000000000000000000000 | 2f35ebd169f8166569f8166569f81665 | 2f35ebd169f8166569f8166569f81665
12 |         000000000000000000000000 | 332d18d156b5986456b5986456b59864 | 332d18d156b5986456b5986456b59864
13 |       00000000000000000000000000 | 583cbe60ca53c80fca53c80fca53c80f | 583cbe60ca53c80fca53c80fca53c80f
14 |     0000000000000000000000000000 | a8e046b5855ca909855ca909855ca909 | a8e046b5855ca909855ca909855ca909
15 |   000000000000000000000000000000 | 3553d0af909796639097966390979663 | 3553d0af909796639097966390979663
16 | 00000000000000000000000000000000 | 5a4075d66b2d3d27d3926c2feb228a07 | 5a4075d66b2d3d27d3926c2feb228a07
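For anyone who wants to check the x64 vectors above from Python without relying on mmh3, here is a minimal pure-Python sketch of the 128-bit x64 variant, written from the public reference algorithm. The function name `murmur3_x64_128_hex` and its helpers are hypothetical, and the output layout (h1 followed by h2, each as 16 hex digits) is my assumption about how the strings in this thread are formed.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
C1 = 0x87C37B91114253D5
C2 = 0x4CF5AD432745937F


def _rotl64(x: int, r: int) -> int:
    """Rotate a 64-bit value left by r bits."""
    return ((x << r) | (x >> (64 - r))) & MASK64


def _fmix64(k: int) -> int:
    """Final avalanche mix applied to each 64-bit half."""
    k ^= k >> 33
    k = (k * 0xFF51AFD7ED558CCD) & MASK64
    k ^= k >> 33
    k = (k * 0xC4CEB9FE1A85EC53) & MASK64
    k ^= k >> 33
    return k


def _mix_k1(k1: int) -> int:
    k1 = (k1 * C1) & MASK64
    k1 = _rotl64(k1, 31)
    return (k1 * C2) & MASK64


def _mix_k2(k2: int) -> int:
    k2 = (k2 * C2) & MASK64
    k2 = _rotl64(k2, 33)
    return (k2 * C1) & MASK64


def murmur3_x64_128_hex(data: bytes, seed: int = 0) -> str:
    """Return the hash as hex(h1) + hex(h2), 16 digits each (assumed layout)."""
    h1 = h2 = seed & MASK64
    n = len(data)
    body_end = n - (n % 16)

    # Body: consume the input in 16-byte little-endian blocks.
    for i in range(0, body_end, 16):
        k1 = int.from_bytes(data[i:i + 8], "little")
        k2 = int.from_bytes(data[i + 8:i + 16], "little")
        h1 ^= _mix_k1(k1)
        h1 = _rotl64(h1, 27)
        h1 = (h1 + h2) & MASK64
        h1 = (h1 * 5 + 0x52DCE729) & MASK64
        h2 ^= _mix_k2(k2)
        h2 = _rotl64(h2, 31)
        h2 = (h2 + h1) & MASK64
        h2 = (h2 * 5 + 0x38495AB5) & MASK64

    # Tail: up to 15 leftover bytes. An empty slice decodes to 0, and
    # mixing 0 leaves the state unchanged, so no branching is needed.
    tail = data[body_end:]
    h2 ^= _mix_k2(int.from_bytes(tail[8:], "little"))
    h1 ^= _mix_k1(int.from_bytes(tail[:8], "little"))

    # Finalization.
    h1 ^= n
    h2 ^= n
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    h1 = _fmix64(h1)
    h2 = _fmix64(h2)
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    return f"{h1:016x}{h2:016x}"


if __name__ == "__main__":
    # Print hashes for 1 to 16 zero bytes, for comparison with the x64 table.
    for length in range(1, 17):
        print(f"{length:2d} | {murmur3_x64_128_hex(bytes(length))}")
```

Running it prints one line per row of the x64 table, so the values can be compared side by side. The x86 variant uses four 32-bit lanes and different constants and is not covered by this sketch.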