Closed jmbnyc closed 3 years ago
Hi jmbnyc, actually the hash has not changed between 3.8.0 and 3.9.0. hash_bytes
and hash_int
are still the same. In between I experimented with CRC but got rid of the changes because of maintenance hassle, and also because I can't benchmark my changes properly on the different machines. On my intel machine CRC implementation was faster than what's in it now, but I had reports that it was slower on their end.
What actually has changed is that I now use the hash a bit differently, I use the lower few bits for the info byte. There shouldn't be much effect on runtime though, and this works better when the higher hash bits are not random enough.
Martin, Thanks for the quick response. You are correct, I forgot that I had grabbed the HEAD 3 weeks ago and then I called it 3.8.1 because I saw that 3.8.0 was the last tag.
Thanks for the explanation of how you use the bits - that works great for all aligned pointers where there is no existence of entropy in the lows bits, e.g. they are all zero.
BTW, I've added a few API's to your core map/set because I find them more useful than the std defined API. I added find per blow and a similar one for erase where I want to get a copy of the value and have erase indicate to me if it was erased (that one can actually use move to obtain the value).
e.g.
// JMB Change 6 Oct 2020
// Added conditional complilation to allow "set" and "map" to compile
template
Again, thanks for the excellent work you have produced.
Hm, I personally find the standard API pretty sufficient. E.g. I like this idom, which is pretty similar to your sample code:
if (auto it = map.find(key); it != map.end()) {
std::cout << "found " << it->second << std::endl;
} else {
std::cout << "not found" << std::endl;
}
I had a question about the latest changes to the hash functions (3.8.0 vs 3.9.0). Can you elaborate on the changes and their potential performance impact (or direct to me where this is discussed). Your work here is wonderful but your commit comments are not very helpful wrt understanding the changes that you are making (it would be nice if you added a description of what you have changed and why it was changed otherwise it is hard to know the motivation unless it is linked to an issue).
It is easy to see that you removed CRC stuff probably because of the maintenance hassle but I'd like to understand the impact on performance (cost of hashing and distribution impacts).
Thanks.