Closed julien-lebot closed 10 years ago
Choice of prime number seems to be very important. If too small hash collisions are numerous. Even the 1000th prime number does not seem to be enough to prevent simple test input to collide.
printf("%u\n", f_htable_generate_key(2, "AA")); printf("%u\n", f_htable_generate_key(2, "AC")); printf("%u\n", f_htable_generate_key(113, "AA")); printf("%u\n", f_htable_generate_key(113, "Fb")); printf("%u\n", f_htable_generate_key(127, "AA")); printf("%u\n", f_htable_generate_key(127, "Fp")); printf("%u\n", f_htable_generate_key(7919, "AA")); printf("%u\n", f_htable_generate_key(7919, "@Q")); printf("%u\n", f_htable_generate_key(7919, "zzzzzzzz@test.com")); printf("%u\n", f_htable_generate_key(7919, "aaaaaaaa@test.com")); printf("zzzzzzzzzzzzzzzz@test.com %u\n", f_htable_generate_key(7919, "zzzzzzzzzzzzzzzz@test.com")); printf("aaaaaaaaaaaaaaaa@test.com %u\n", f_htable_generate_key(7919, "aaaaaaaaaaaaaaaa@test.com"));
I suggest as a temporary fix (as it's not a real solution) to change the bitwise shift up to:
ret = (ret << 1) + str[i];
This will saturate the hash slower and will generate more entropy with the addition. It's not a solution because it will still eventually saturate.
Now you can use your own hash function. Are you right with this ?
Choice of prime number seems to be very important. If too small hash collisions are numerous. Even the 1000th prime number does not seem to be enough to prevent simple test input to collide.
I suggest as a temporary fix (as it's not a real solution) to change the bitwise shift up to:
This will saturate the hash slower and will generate more entropy with the addition. It's not a solution because it will still eventually saturate.