Aeledfyr / deepsize

A rust crate to find the total size of an object, on the stack and on the heap
MIT License
103 stars 19 forks source link

Update DeepSizeOf for HashMap for the new hashbrown implementation #12

Closed Aeledfyr closed 4 years ago

Aeledfyr commented 5 years ago

If anyone knows how hashbrown's hashmap allocates memory, can you give me pointers? Is it like the old implementation, where it was effectively the same as a Vec<(u64, K, V)>?

Aeledfyr commented 4 years ago

I believe that the implementation is now more correct, but it isn't perfect. I think hashbrown uses an internal array of buckets [(K, V)] with some control bytes at the start and end; I can't figure out the exact number of buckets, but .capacity() should be good enough, and I can't figure out the size of the control byte area.

droundy commented 4 years ago

Why not use a custom allocations to monitor allocations in the test suite? Then you wouldn't have to guess whether you guessed correctly.

Aeledfyr commented 4 years ago

Using a custom allocator to check the actual size would work, but it wouldn't give the same answers as the recursive method. While it would be more accurate, the original purpose of this library was to estimate the size of the structures without hooking into the allocator, as an alternative to heapsize. Since heapsize is no longer maintained I may look into an alternate implementation that uses a custom allocator, but that is significantly more complex than the current system.

There often isn't enough information exposed through the public API to determine the exact allocated size, so a custom allocator could only tell how far off the calculation was in a specific case. In this case, the only exposed information about the allocation is .capacity(), which isn't enough to figure out the exact number of buckets.

It would still be good as some sort of benchmark to test the average error for specific data structures, so I'll look into it. If you know of any resources for setting up a custom allocator for monitoring, I would appreciate it.

droundy commented 4 years ago

I do something like this in tinyset. It looks like I use stats_alloc, but it's been a while since I've looked at this code.