alecthomas / mph

Minimal Perfect Hashing for Go
BSD 3-Clause "New" or "Revised" License
177 stars 23 forks source link

Set n as a multiple of the key set size m #7

Closed sqs closed 9 years ago

sqs commented 9 years ago

See discussion at https://github.com/alecthomas/mph/issues/6 for background. I was finding that mph failed to generate collision-free hashes for simple key sets. Digging into the paper revealed to me some differences between how it recommends setting m and what mph was doing. This PR makes mph choose m=len(keys) and then sets n as a multiple of m, instead of the previous way of setting n=len(keys) and then m=n/2. This PR makes the included test cases pass; without it, they often fail.

Again, I am not super familiar with MPHs or CHD although I did read the paper and have an undergraduate-level education in theoretical CS. So, please let me know if I've misunderstood anything.

alecthomas commented 9 years ago

So I guess, to summarise: