google / tcmalloc

Apache License 2.0

HugePageFiller, how does it do what it does? #172

Closed deadalnix closed 4 months ago

deadalnix commented 1 year ago

After a quick email exchange with @ckennelly , he suggested I open an issue about this, so that there is a public trail of the conversation (good call).

So I've been wondering about the implementation of the huge page aware allocator. While the paper explains really well what the code does and why, the how is a bit more mysterious.

So let me ask a few specific questions that would help in going through the code to understand it.

The allocator organizes itself around huge pages. When it needs memory from these pages, it prioritizes the existing huge pages that already hold a lot of allocations since, statistically, they are less likely to be returned to the OS, and therefore we want to cram as much as possible into them. Another critical parameter is the longest range of unallocated slots within the huge page: we want to find a huge page with a range of unallocated memory that is big enough, but we also generally want to avoid splitting large ranges of unallocated memory.
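To make the "longest range of unallocated slots" concrete, here is a minimal sketch (not tcmalloc's actual code) of computing the longest free run from an allocation bitmap, assuming a 2 MiB huge page divided into 512 four-KiB pages:

```cpp
#include <bitset>
#include <cstddef>

// Assumption: a 2 MiB huge page holds 512 four-KiB pages, tracked as a
// bitmap in which a set bit means "allocated".
constexpr size_t kPagesPerHugePage = 512;

// Returns the length of the longest contiguous run of free (unset) bits.
size_t LongestFreeRun(const std::bitset<kPagesPerHugePage>& allocated) {
  size_t longest = 0, current = 0;
  for (size_t i = 0; i < kPagesPerHugePage; ++i) {
    if (!allocated.test(i)) {
      ++current;
      if (current > longest) longest = current;
    } else {
      current = 0;  // an allocated page ends the current free run
    }
  }
  return longest;
}
```

A request for N contiguous pages can only be satisfied by a huge page whose longest free run is at least N, which is why this quantity is a natural first axis for organizing candidate pages.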

It is unclear to me how the selection process works in practice. As far as I can tell, HugePageFiller is responsible for this. It seems to manage an ungodly number of freelists (512 possible free range lengths, 64 priority classes). Is it really that simple? Considering there are lists for partially released and fully released huge pages too, as well as donated ones, that makes for a lot of freelists, no?
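For what it's worth, here is how I picture the two-dimensional indexing, as a hypothetical sketch: one axis is the (capped) longest free range, the other a coarse fullness bucket derived from the allocation count, and the pair maps to a slot in a flat array of lists. All names and constants here are illustrative, not tcmalloc's actual values:

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative constants, not tcmalloc's real ones.
constexpr size_t kMaxLongestFreeRange = 512;  // pages per 2 MiB huge page
constexpr size_t kChunks = 8;                 // fullness buckets per range

// Coarse fullness bucket: more allocations => denser page, which we want
// to fill first. A log2-style bucketing keeps the bucket count small.
size_t Chunk(size_t num_allocations) {
  size_t c = 0;
  while ((size_t{1} << (c + 1)) <= num_allocations && c + 1 < kChunks) ++c;
  return c;
}

// Flat index into one big array of freelists, keyed by both dimensions.
size_t ListIndex(size_t longest_free_range, size_t num_allocations) {
  size_t lfr = std::min(longest_free_range, kMaxLongestFreeRange - 1);
  return lfr * kChunks + Chunk(num_allocations);
}
```

With a layout like this, "pick the densest page with a big-enough free run" becomes a bounded scan over list indices rather than a search over all huge pages, which is presumably why so many lists are cheaper than they sound.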

Isn't it a problem that huge pages are only selected based on their longest free run? A page with a high score (as in, unlikely to become empty) and a large free range (plus other smaller free ranges) will not see its smaller free ranges allocated before lower-priority huge pages run out of small free ranges (if they don't have large ones). Isn't that a problem? Or is the correlation between "no large free range" and "unlikely to deallocate" good enough that it works in practice?

Last, but probably not least, I don't understand the return path. When an allocation is freed, it goes through that magical box that is the middle end, which is completely opaque to me, and I can't find a good explanation of what it does and why.

ckennelly commented 1 year ago

HugePageFiller consulting a bunch of freelists is (approximately) all it does. A few notes, though:

In terms of the longest free run, this is discussed a bit in our paper, Section 4.4.

In terms of the return path: We're primarily using the address-indexed radix tree (pagemap) to look up information about what huge page an allocation came from. As we finished up the allocation during HugePageAwareAllocator::LockAndAlloc, we updated the radix tree with the info we need for deallocation.
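To illustrate the shape of that lookup, here is a toy version of an address-indexed pagemap (a `std::unordered_map` stands in for the real multi-level radix tree, and `kPageShift` and the `Span` type are assumptions made for the sketch):

```cpp
#include <cstdint>
#include <unordered_map>  // stand-in for the real radix tree

constexpr uintptr_t kPageShift = 12;  // 4 KiB pages (assumed)

// Owner metadata recorded at allocation time; empty here for brevity.
struct Span {};

// Maps a page number (address >> kPageShift) to the Span that owns it.
class PageMap {
 public:
  void Set(uintptr_t addr, Span* span) { map_[addr >> kPageShift] = span; }
  Span* Get(uintptr_t addr) const {
    auto it = map_.find(addr >> kPageShift);
    return it == map_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<uintptr_t, Span*> map_;
};
```

The key point is that deallocation carries only a pointer, so all the information needed to route the free (which huge page, which filler list) has to be recoverable from the address alone, which is exactly what an address-indexed structure populated at allocation time provides.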

deadalnix commented 1 year ago

@ckennelly That is very useful information! Thanks!