VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

introducing wyhash hash function #2159

Open wangyi-fudan opened 5 years ago

wangyi-fudan commented 5 years ago

Dear all, I would like to introduce wyhash (https://github.com/wangyi-fudan/wyhash). It is a portable, 64-bit, little-endian hash function. It is the fastest hash function that passes all of the strict tests in SMHasher (https://github.com/rurban/smhasher). I think vowpal_wabbit could use it to improve performance.

lokitoth commented 5 years ago

Hi @wangyi-fudan: Thanks for looking into hashing. I had a few additional questions for you:

Is the candidate hash function value-compatible with the existing algorithm? If not, this would be a model-breaking change, so we would need to stage it as a non-default option at first.
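To illustrate the value-compatibility concern, here is a minimal Python sketch (not VW code). It assumes a hashed weight table indexed by `hash(feature) & mask`. The names `NUM_BITS`, `hash_old`, `hash_new`, `feature_index`, and the `hash_name` option are hypothetical, and CRC32 and BLAKE2b are only stand-ins: they are neither murmur3 nor wyhash.

```python
import hashlib
import zlib

NUM_BITS = 18                 # hypothetical table size: 2**18 weight slots
MASK = (1 << NUM_BITS) - 1


def hash_old(name: str) -> int:
    """Stand-in for the existing hash (NOT murmur3): CRC32 from zlib."""
    return zlib.crc32(name.encode("utf-8"))


def hash_new(name: str) -> int:
    """Stand-in for a candidate hash (NOT wyhash): 8 bytes of BLAKE2b."""
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


# Staging idea: the old hash stays the default, the new one is opt-in.
HASHES = {"old": hash_old, "new": hash_new}


def feature_index(name: str, hash_name: str = "old") -> int:
    """Map a feature name to a slot in the weight table."""
    return HASHES[hash_name](name) & MASK


# A toy "model" trained with the old hash: weight 1.0 at each trained slot.
features = ["user=alice", "item=42", "hour=13"]
model = {feature_index(f): 1.0 for f in features}


def score(names, hash_name: str = "old") -> float:
    """Sum the stored weights of the slots the features hash to."""
    return sum(model.get(feature_index(n, hash_name), 0.0) for n in names)


old_indices = [feature_index(f, "old") for f in features]
new_indices = [feature_index(f, "new") for f in features]
print("old:", old_indices, "score:", score(features, "old"))
print("new:", new_indices, "score:", score(features, "new"))
```

With the old hash every feature finds its trained weight. With the new hash the same names are expected to land in different slots, so a saved model is read back incorrectly. That is why a non-value-compatible hash would have to start as an opt-in setting.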

Also, have you had a chance to run benchmarks comparing performance before and after the change? As an example, take a look at the murmur2 to murmur3 discussion here: https://github.com/VowpalWabbit/vowpal_wabbit/wiki/murmur2-vs-murmur3
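As a rough idea of what a before/after micro-benchmark could look like, here is a small Python timing harness. It is a sketch only: `benchmark`, `crc32_hash`, `blake2b_hash`, and the synthetic key format are hypothetical, and the two hashes are stand-ins from the Python standard library, not murmur3 or wyhash.

```python
import hashlib
import timeit
import zlib


def crc32_hash(data: bytes) -> int:
    """Stand-in 'before' hash (NOT murmur3)."""
    return zlib.crc32(data)


def blake2b_hash(data: bytes) -> int:
    """Stand-in 'after' hash (NOT wyhash)."""
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def benchmark(hash_fn, keys, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds to hash every key once."""
    timer = timeit.Timer(lambda: [hash_fn(k) for k in keys])
    return min(timer.repeat(repeat=repeats, number=1))


# Synthetic feature names; real feature strings may have different lengths.
keys = [f"namespace^feature_{i}".encode("utf-8") for i in range(10_000)]

candidates = [("crc32", crc32_hash), ("blake2b", blake2b_hash)]
results = {name: benchmark(fn, keys) for name, fn in candidates}

for name, seconds in results.items():
    print(f"{name}: {len(keys) / seconds / 1e6:.2f} M keys/s")
```

Raw hashing throughput is only part of the picture. A convincing comparison would also report end-to-end training time and model quality on real datasets, since a different hash changes which features collide.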

wangyi-fudan commented 5 years ago

Hi, it is not value-compatible with the existing hash function, but it will be compatible with the learning algorithms, since the interface is the same. I do not use vw much; instead, I write my own C++ code for learning algorithms. I developed wyhash because my own algorithms rely heavily on feature hashing, and it is designed to accelerate feature hashing.
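The "same interface" point can be sketched as follows: a learner built on feature hashing only needs a function from bytes to an integer, so the hash can be swapped without touching the learning code. This is an illustrative Python sketch, not code from wyhash or VW. `HashedLogisticRegression` and its parameters are hypothetical, and `zlib.crc32` is used purely as a stand-in hash.

```python
import math
import zlib
from typing import Callable, List


class HashedLogisticRegression:
    """Online logistic regression over hashed binary features."""

    def __init__(self, hash_fn: Callable[[bytes], int],
                 num_bits: int = 18, lr: float = 0.5) -> None:
        self.hash_fn = hash_fn              # any bytes -> int hash fits here
        self.mask = (1 << num_bits) - 1
        self.weights = [0.0] * (1 << num_bits)
        self.lr = lr

    def _indices(self, features: List[str]) -> List[int]:
        # The hashing trick: feature name -> slot in a fixed-size table.
        return [self.hash_fn(f.encode("utf-8")) & self.mask for f in features]

    def predict(self, features: List[str]) -> float:
        z = sum(self.weights[i] for i in self._indices(features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features: List[str], label: int) -> None:
        # One SGD step on the logistic loss; every active feature has value 1.
        error = self.predict(features) - label
        for i in self._indices(features):
            self.weights[i] -= self.lr * error


model = HashedLogisticRegression(hash_fn=zlib.crc32)
for _ in range(50):
    model.update(["word=good"], 1)
    model.update(["word=bad"], 0)

print(model.predict(["word=good"]), model.predict(["word=bad"]))
```

Passing a different `hash_fn` changes where each feature's weight is stored but not how the learner is written or trained. Weights learned under one hash are still not reusable under another.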