microsoft / mimalloc

mimalloc is a compact general purpose allocator with excellent performance.
MIT License

Avoiding mi_cfree on mac interpose mode #313

Open thomcc opened 4 years ago

thomcc commented 4 years ago

I think you might just need to interpose malloc_default_zone too. This is what asan ends up doing, among other things (I suspect the other things it interposes are related to asan's functionality): https://github.com/llvm/llvm-project/blob/350fafabe9d3bda75e80bf077303eb5a09130b53/compiler-rt/lib/sanitizer_common/sanitizer_malloc_mac.inc#L85
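For readers unfamiliar with the mechanism, here is a minimal sketch of what a dyld interpose table looks like. It is not the code from mimalloc or ASan. The `sketch_*` names and the `interpose_t` type are made up for illustration, and the replacements only forward to the originals. On macOS the table has to live in the `__DATA,__interpose` section of a dylib (typically loaded via DYLD_INSERT_LIBRARIES); on other platforms the section attribute is dropped so the file still compiles.

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__APPLE__)
#include <malloc/malloc.h>
#endif

/* One interpose entry: dyld rebinds calls to `target` so they reach `replacement`. */
typedef struct interpose_s {
  const void* replacement;
  const void* target;
} interpose_t;

/* Hypothetical replacements. A real allocator would do its own work here;
   these forward to the originals (calls made from inside the interposing
   image itself are not rebound, so this does not recurse). */
void* sketch_malloc(size_t size) { return malloc(size); }
void  sketch_free(void* p)       { free(p); }

#if defined(__APPLE__)
/* The suggestion from the comment above: also interpose malloc_default_zone.
   A real allocator would return its own zone; this sketch returns the system one. */
malloc_zone_t* sketch_malloc_default_zone(void) { return malloc_default_zone(); }
#define SKETCH_INTERPOSE __attribute__((used, section("__DATA,__interpose")))
#else
#define SKETCH_INTERPOSE __attribute__((used))  /* no dyld here: plain table */
#endif

SKETCH_INTERPOSE static const interpose_t sketch_interposers[] = {
  { (const void*)&sketch_malloc, (const void*)&malloc },
  { (const void*)&sketch_free,   (const void*)&free },
#if defined(__APPLE__)
  { (const void*)&sketch_malloc_default_zone, (const void*)&malloc_default_zone },
#endif
};
```

The point of the extra `malloc_default_zone` entry is that code asking for the default zone directly would otherwise bypass the interposed `malloc`/`free` pair.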

You can also try __attribute__((constructor(0))) to get initialized earlier than with a plain __attribute__((constructor)). I'm not sure that matters for shared libs (probably not), but it can help for static linking.
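A small sketch of the constructor-priority idea, assuming GCC or Clang. The names here are made up, and the priority 101 is used instead of 0 only because GCC warns that priorities 0 to 100 are reserved for the implementation. Lower numbers run first, and constructors with an explicit priority run before those without one. This ordering is reliable within one translation unit; whether it holds across object files or libraries depends on the linker and loader.

```c
#include <stddef.h>

/* Records the order in which the constructors below ran. */
int    sketch_init_order[2];
size_t sketch_init_count = 0;

/* Explicit priority: runs before any constructor without a priority. */
__attribute__((constructor(101)))
static void sketch_early_init(void) {
  sketch_init_order[sketch_init_count++] = 1;
}

/* Plain constructor: runs after the prioritized one, still before main(). */
__attribute__((constructor))
static void sketch_default_init(void) {
  sketch_init_order[sketch_init_count++] = 2;
}
```

By the time main() starts, both constructors have run and sketch_init_order holds the early one first.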

This is a somewhat messy commit (includes unrelated whitespace changes, XXX comments, and doesn't reuse the existing interpose stuff) that does these and seems to work for me. https://github.com/thomcc/mimalloc/commit/52b5237028a27644ffc4e8a671d7b884a2fa7379

Unfortunately, IDK how to trigger the bad behavior that led to you using mi_cfree in the first place, so IDK if this fixes it. That said, if you want, I can PR it.

Still, it's a bummer that doing this in a statically linked build isn't viable the way it is on other Unixes...

daanx commented 4 years ago

Awesome -- thanks so much. I just got myself a shiny Mac mini, so I can now test and develop locally much more easily, and I will try this out. It is a hassle on the Mac; I wonder why they went with a different loading system and the zone system in the first place, since the basic BSD/Linux loading works so nicely :-( ah well.

I forgot how to trigger the bad behavior, but I remember it being in one of the mimalloc-bench benchmarks, so I'll let you know how it goes.

daanx commented 3 years ago

Hi @thomcc -- thanks again for this and apologies for the delay; I took the commit above and integrated it -- I also updated the whole benchmark suite to run on macOS, and it seems to work fine now without the cfree check :-) Very nice.
Some benchmarking shows that mimalloc is often 2x faster than the system allocator (!) Thanks again.