edawson / mkmh

Generate kmers/minimizers/hashes/MinHash signatures, including with multiple kmer sizes.
MIT License

Incorporate xxHash and ntHash as optional hashing algorithms #3

Open edawson opened 5 years ago

edawson commented 5 years ago

MurmurHash is relatively fast, disperses well, and keeps us backwards compatible with Mash, but it is slower than some newer algorithms. Moving to xxHash should yield a ~2x speed improvement in the hashing portions of the code, and ntHash should be faster still.
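One way to make the hash optional is to pass the hash function into the k-mer hashing loop instead of hard-coding it. The following is only a minimal sketch under stated assumptions: `HashFn`, `HashAlgo`, `select_hash`, and `hash_kmers` are hypothetical names, not mkmh's actual API, and FNV-1a and djb2 are stand-ins for MurmurHash, xxHash, and ntHash so that the example needs no external libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical interface: any hash that maps a byte range to 64 bits.
using HashFn = std::uint64_t (*)(const char* data, std::size_t len);

// Stand-in #1: 64-bit FNV-1a (used here in place of MurmurHash/xxHash).
std::uint64_t fnv1a_64(const char* data, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL;      // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 0x100000001b3ULL;                    // FNV prime
    }
    return h;
}

// Stand-in #2: djb2, widened to 64 bits.
std::uint64_t djb2_64(const char* data, std::size_t len) {
    std::uint64_t h = 5381;
    for (std::size_t i = 0; i < len; ++i) {
        h = h * 33 + static_cast<unsigned char>(data[i]);
    }
    return h;
}

// Hypothetical option that a command-line flag could map to.
enum class HashAlgo { FNV1A, DJB2 };

HashFn select_hash(HashAlgo algo) {
    switch (algo) {
        case HashAlgo::DJB2: return djb2_64;
        case HashAlgo::FNV1A:
        default:             return fnv1a_64;
    }
}

// Hash every k-mer of seq with the chosen function.
// Returns an empty vector when seq is shorter than k.
std::vector<std::uint64_t> hash_kmers(const std::string& seq, std::size_t k, HashFn fn) {
    assert(k > 0);
    std::vector<std::uint64_t> hashes;
    if (seq.size() < k) return hashes;
    hashes.reserve(seq.size() - k + 1);
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        hashes.push_back(fn(seq.data() + i, k));
    }
    return hashes;
}
```

A caller would pick the function once, e.g. `hash_kmers(seq, 21, select_hash(HashAlgo::FNV1A))`. Note that a rolling hash such as ntHash gets its speed from updating the previous k-mer's hash rather than rehashing each k-mer from scratch, so it would likely need a different interface than this per-k-mer function pointer.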

luizirber commented 5 years ago

Wearing my sourmash hat here: I think we can bring this up with Mash and support ntHash. There is a field in the Mash JSON schema for specifying which hash is being used.

But before going crazy supporting every hash function under the sun, it would be useful to have some agreement so that everything stays compatible.

P.S.: we would also need to agree on ntHash 1 or 2; see discussion here