Closed patlecat closed 2 years ago
It mostly improves on my old implementation by much more likely to be bug free 🙂 robin_hood still has the chance of overflow with bad hash, even with some mitigations. This new map practically doesn't have that issue. By splitting up it's internal structure into an index and a data vector the code became much simpler, and I'm now able to easily support custom allocators.
robin_hood still has the chance of overflow with bad hash
Are there details available on this? E.g. when does it hit and what exactly happens then (e.g. exception thrown)?
@greenrobot It can happen when >127 entries per bucket are used. If that happens robin_hood tries to mitigate the problem by increasing the map's size or changing the hash. If that doesn't help, an std::overflow_error
is thrown. It never happened for me, but I had user reports that this has happened.
With unordered_dense there can be 16.7 million entries per bucket.
Thanks, so this also highly depends on the quality/flaws of the hash function; assuming that there's one bucket per hash(?).
yes it depends on the hash.
@martinus Ok thanks. Is that the only difference? So has the performance also improved in comparison to the old lib? And what about rerun of your benchmark? Would love to see the current state of things.
@patlecat I'll release a completely redone benchmark in a few days.
@patlecat I'll release a completely redone benchmark in a few days.
omg omg omg Awesome man, thanks so much!
yes it depends on the hash.
Sorry, one more questions: any prediction on how uint64_t keys behave, that are incremented monotonically?
@greenrobot depends a lot. E.g. if it increases by 1, then just use a vector with the key as an index. If it doesn't, use a proper hash.
@patlecat @greenrobot I've finally published my big benchmark update: https://www.reddit.com/r/cpp/comments/x82kac/comprehensive_c_hashmap_benchmarks_2022/?
@martinus Wooooow! Amazing work! I'm eager to play with it now... :grin:
I'm eagerly following your new lib. I hope you will explain more soon how it improves on all your previous hash_map libs, I'm not clear about this yet.
Also I hope you will make a new benchmark round again like in 2019, this was awesome and gave so much insight in all the libs out there.
Thanks :)