See the discussion at https://github.com/alecthomas/mph/issues/6 for background. I found that mph failed to generate collision-free hashes for simple key sets. Digging into the paper revealed some differences between how it recommends setting m and what mph was doing. This PR makes mph choose m=len(keys) and then set n as a multiple of m, instead of the previous approach of setting n=len(keys) and then m=n/2. With this PR the included test cases pass; without it, they often fail.
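To make the change concrete, here is a rough sketch of the two sizing rules side by side. The function names and the factor parameter are made up for illustration and are not taken from the mph code; the actual multiple is whatever the diff in this PR sets.

```go
package main

import "fmt"

// oldParams sketches the previous sizing: n = len(keys), then m = n/2.
// Hypothetical helper, not a function in mph.
func oldParams(numKeys int) (m, n int) {
	n = numKeys
	m = n / 2 // integer division, as in Go
	return m, n
}

// newParams sketches the sizing in this PR: m = len(keys), then n = factor*m.
// factor is an assumed placeholder for the multiple; its real value is not
// stated in this description.
func newParams(numKeys, factor int) (m, n int) {
	m = numKeys
	n = factor * m
	return m, n
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}

	oldM, oldN := oldParams(len(keys))
	newM, newN := newParams(len(keys), 2) // 2 is an arbitrary example factor

	fmt.Println("old: m =", oldM, "n =", oldN)
	fmt.Println("new: m =", newM, "n =", newN)
}
```

The point of the comparison is only that m now tracks the number of keys directly and n is derived from m, rather than the other way around.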
Again, I am not very familiar with MPHs or CHD, although I did read the paper and have an undergraduate-level education in theoretical CS, so please let me know if I've misunderstood anything.