Open MatthewCushing opened 5 years ago
Most hashes are seeded, and that seed is transformed into an internal state before hashing. So for instance a hash might take a 128 bit seed, and then transform it into a 256 bit state, and hashing proceeds against the state. Sometimes this operation is expensive. My version of smhasher separates this into two steps. The first takes a seed and sets up the state, and the second uses that state to hash. Consider djb2 hash, it does not take a seed, and initializes its 32-bit state to 5381. If it were seeded then it would allow you to set the state explicitly via a seed. So for instance it might do:
state = seed ^ 5381;
or something similar. Maybe it might take a 16 bit seed and then do something like:
state = ((seed ^ 123) << 16 | (seed ^ 5381));
so that it ensures the state does not start as zero.
out is the return value.
So I was able to get a variation of the FNV1a tests working by adding it in which was great. But honestly, I'm not 100% sure I've put it in correctly since I'm not sure what is getting passed in. If we look at one of your implementations:
The big thing that I'm not understanding is what is going on here:
Am I correct that this is initially a value of 0? Thus XOR'ing it is doing nothing but keeping its original state that we have casted and giving it the constant value here?
This is a version of it we have implemented:
The way I put it into SMHasher is:
Or another has we've implemented is Djb2, here's our original implementation:
And here's how I was implementing it into SMHasher:
Am I getting this right or am I getting this completely wrong? I'd apprecitate any help, I'm still in the process of getting familiar with hash functions, the terminology, and how it all comes together.
Thanks!
EDIT: I have a feeling I'm not putting these in right as I am getting some pretty big failures via the SMHasher tests.
For the version of the FNV1a that we used I got - Failed 107 of 195 tests For the version of Djb2 that we used I got - Failed 160 of 195 tests
I'm running your version of FNV1a to see if it's similar but I have a feeling it won't be.
EDIT2: Welp, I accidentally exited out of test running for your version of FNV1a but the amount of time it was taking to implement it vs our version was vastly different. I will run it again tomorrow when I can but I don't have time before I get off work. In the end, I feel as if I am not understanding the arguments correctly that I need to implement into the hash function. It would be great if you could explain what each argument is doing.
From what I understand: key = is the string that is going to be hashed len = the length of the key state = (not sure exactly) out = essentially the return value?
EDIT3: Well it seems the tests are similar but I'd love some confirmation that I'm understanding this correctly.