Closed plokhotnyuk closed 4 years ago
The problem is that xxHash on the JVM isn't, in the implementations I've tested, faster than MurmurHash3. I haven't explored MetroHash, but it's 64 bits (at a minimum) anyway, and it's CPU-specific; I'm not sure how well it performs on alternate architectures (e.g. AMD).
If someone wanted to do the work of benchmarking the latest hash variants and got promising results we could switch again, but last time we did this MurmurHash2 (actually--3 wasn't out yet) was the best available, all things considered. (Including better than xxHash at the time.)
My testing in 2016 showed that xxHash was much faster than MurmurHash3 on the JVM.
Now Zero-allocation Hashing includes implementations for murmur3, xx, metroHash, and new algorithm wyHash: somebody could compare them (I didn't benchmark the latest two).
Note that according to the latest version of SMHasher, MurmurHash3, xxHash, and MetroHash have some quality problems. Only wyHash and FarmHash don't, among the algorithms implemented in Zero-allocation Hashing.
Zero-allocation Hashing 's wyhash implementation use 64X64 mul which is slow on 32 bit machine. Please note that WYHASH32 define enables fast 32 bit code. Test and choose a suitable one by yourself.
While this is a relevant and important topic, scala/bug is not the right place to discuss it. Feel free to open a thread on contributors.
Link to the xxHash release which suports xxh3: https://github.com/Cyan4973/xxHash/releases/tag/v0.7.0
Link to the current implementation of Mumur hashing in Scala: https://github.com/scala/scala/blob/2.12.x/src/library/scala/util/hashing/MurmurHash3.scala#L181-L201