martinus / robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
https://gitter.im/martinus/robin-hood-hashing
MIT License
1.5k stars 142 forks source link

Fix memory leak #103

Closed BearishSun closed 3 years ago

BearishSun commented 3 years ago

Hello, I've recently integrated your map into my application and it has been working pretty great overall.

However I have a memory leak that I've narrowed down to the robin_hood map. It happens when I continually do the following:

  1. Insert some elements into a map
  2. clear()
  3. reserve(maximumNumberOfElementsEver)
  4. Repeat steps 1-3

This keeps adding new memory blocks to the memory pool and the memory use keeps growing. I realize the reserve() is not necessary but this was a remnant from when I used std::unordered_map() and others doing the switch might experience similar issues.

martinus commented 3 years ago

Hi, thanks for the report and the PR! Unfortunately that fix is not enough. E.g. when there already is data in the map, a repeated call to reserve() will still continuously allocate.

I've commited 628cfdd629035e997f181e63b1c62c8ac0caea29 where I change the logic of reserve(): It now only does something when the requested size is bigger than the current size in the hashmap. That way it will also work when data is already in the map.

rehash() still behaves as before, so repeated calls to rehash() will increase memory.

BearishSun commented 3 years ago

Great, thank you.

Note that my use-case was as follows: maximumNumElements = std::max(maximumNumElements, map.size()) map.clear() map.reserve(maximumNumElements) repeat...

Usually maximumNumElements will keep growing throughout the application lifetime. I imagine in this case we will continue to see a leak?

martinus commented 3 years ago

That's not a problem with the fix, reserve() now only mallocs when maximumNumElements is larger than the map's capacity

cculianu commented 3 years ago

Jesus christ are you kidding me? I have been plagued by this for a year in my server. Grr.

Thanks for fixing.

cculianu commented 3 years ago

Just to be clear i consider it a bug if the thing keeps groing memory every time rehash() is called -- if I am to understand correctly that still is the current behavior?

Really the map should not be reserving anything beyond what the caller specified.... one can imagine in a big codebase rehash would be called a lot for religious reasons...

Please clarify what is the current behavior now? Should we never be calling any of these methods? Can you clairfy?