Closed nicklegr closed 1 year ago
Hi! The benefit is that I don't have to check each element when distance is equal, I only have to check until the fingerprint is too large. So when you have e.g. 10 elements that hash to the same bucket and I want to find an element that hashes to the same bucket but is not there, on average I only have to check 5 of them (until dist_and_fingerprint
is too large). Without that sorting, I would have to compare all 10 fingerprints.
As far as I know no robin hood hashing except my implementations use this behavior.
I see! I understand it well.
I think there is a slight performance penalty to insert
by inserting the element at a slightly more forward position than intended.
But if find
is faster, I think it's a good trade-off.
At least in my application, the performance of find
is more important.
Thank you for your clear answer!
I am reading your source code. It is very cleverly designed and interesting. Great work!
I have one question. In your implementation of
insert
, the expressiondist_and_fingerprint <= rhs.m_dist_and_fingerprint
appears. This works the same as the general Robin Hood hashing when dist is different. However, when dist is equal, it may be inserted before the intended position, depending on the value of fingerprint. Is there any benefit to this behavior other than simplifying the implementation?I understand that this works correctly. For
insert
it isdist_and_fingerprint <= rhs.m_dist_and_fingerprint
, forfind
it isdist_and_fingerprint > rhs.m_dist_and_fingerprint
, which is symmetrical.