Open landaire opened 3 months ago
Would you like to try to hint 32k aligned allocations of 32k size?
> There may be an additional gap here that the memory is marked as addressable up to the nearest page boundary even if you did not mmap() with a len % PAGE_SIZE == 0. The memory will always be poisoned with the value 0, meaning that ASAN will never detect a memory violation except via the SIGSEGV deadly signal handler if the memory has been munmap()'d.
With mmap interceptors we have no goal to detect anything. This is outside of C/C++, so it is up to the user if they want to poison for detection. E.g. you can implement poisoning in a custom allocator.
The goal of these interceptors is to avoid false positives when we mmap a memory region which was somehow left poisoned.
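As a rough illustration of the "poison it yourself" suggestion above, here is a minimal sketch of a custom allocator marking the slack between the requested length and the next page boundary as poisoned. It relies on the `ASAN_POISON_MEMORY_REGION` macro from `<sanitizer/asan_interface.h>`; `slack_bytes` and `poison_mapping_slack` are hypothetical helper names, and the fallback macros are an assumption for builds where the header is unavailable. ASAN poisons at 8-byte granularity, so the first few slack bytes may stay addressable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the real ASAN interface when the header exists; otherwise fall back
 * to no-op macros so the sketch still compiles (assumption for portability). */
#if defined(__has_include)
#  if __has_include(<sanitizer/asan_interface.h>)
#    include <sanitizer/asan_interface.h>
#  endif
#endif
#ifndef ASAN_POISON_MEMORY_REGION
#  define ASAN_POISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#  define ASAN_UNPOISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#endif

/* Bytes between len and the next multiple of page_size (0 if already aligned). */
size_t slack_bytes(size_t len, size_t page_size) {
    size_t rounded = (len + page_size - 1) / page_size * page_size;
    return rounded - len;
}

/* Hypothetical allocator hook: after mmap(NULL/hint, len, ...) returned base,
 * poison the tail of the last page that the caller did not ask for.
 * Returns the number of slack bytes handed to ASAN. */
size_t poison_mapping_slack(void *base, size_t len, size_t page_size) {
    size_t slack = slack_bytes(len, page_size);
    if (slack != 0) {
        ASAN_POISON_MEMORY_REGION((char *)base + len, slack);
    }
    return slack;
}
```

With something like this in the allocator, a read past `base + len` inside an ASAN build would be reported as a use of poisoned memory instead of silently succeeding.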
> Would you like to try to hint 32k aligned allocations of 32k size?
I can, but what outputs are you looking for?
> The goal of these interceptors is to avoid false positives when we mmap a memory region which was somehow left poisoned.
This makes sense. Is it intentional to leave the shadow memory for the mapping around? FWIW I don't know much about the shadow memory internals, so maybe it's not so simple to just make it go away.
> I can, but what outputs are you looking for?
ASAN uses a 1 to 8 mapping. 4k of mmap is just 4k/8 of shadow. It cannot do anything about 1/8 of a page. I believe ASAN uses NOT_NEEDED on full pages, so 32k should result in full shadow pages.
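The arithmetic in that last reply can be checked with a small sketch. It assumes the default 1:8 shadow scale, a 4 KiB page size, and a page-aligned shadow offset (so the offset can be ignored when counting pages); `shadow_bytes` and `whole_shadow_pages` are hypothetical helpers, not ASAN functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SCALE 8u    /* assumed: 8 application bytes per shadow byte */
#define ASSUMED_PAGE 4096u /* assumed page size */

/* Shadow bytes needed to describe app_bytes of application memory. */
size_t shadow_bytes(size_t app_bytes) {
    return app_bytes / SHADOW_SCALE;
}

/* Number of shadow pages that are covered completely by the mapping
 * [addr, addr + len). Only such pages could be handed back to the OS. */
size_t whole_shadow_pages(uintptr_t addr, size_t len) {
    uintptr_t shadow_beg = addr / SHADOW_SCALE;
    uintptr_t shadow_end = (addr + len) / SHADOW_SCALE;
    /* Round the start up and the end down to page boundaries. */
    uintptr_t page_beg = (shadow_beg + ASSUMED_PAGE - 1) / ASSUMED_PAGE * ASSUMED_PAGE;
    uintptr_t page_end = shadow_end / ASSUMED_PAGE * ASSUMED_PAGE;
    return page_end > page_beg ? (size_t)((page_end - page_beg) / ASSUMED_PAGE) : 0;
}
```

Under these assumptions a single 4k mapping needs only 512 shadow bytes and never covers a whole shadow page, while a 32k mapping covers exactly one shadow page only when its address is itself 32k aligned, which is presumably why the suggestion asks for both the size and the alignment.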
Context
While working on some changes to one of our fuzzers, we observed that the fuzzer was failing our health check and was quickly exceeding our 5GB RSS limit.
We eventually narrowed it down to our new allocator. The harness crashed when the new allocator was used but RSS plateaued when using malloc.
The new allocator uses mmap() with a random page hint to ensure that there's a very low likelihood of the same page being returned twice when allocating. This is to ensure that when the memory is deallocated and we munmap() it, any UAFs on that address range will fail fast.

We also realized that this only occurred when running under ASAN but did not occur under other sanitizers like MSAN.
The Bug
Summary
If you mmap() a lot of memory and those mmap() calls frequently return new addresses (such as through the address hint), the shadow memory for that address is not cleaned up after an munmap().

There may be an additional gap here that the memory is marked as addressable up to the nearest page boundary even if you did not mmap() with a len % PAGE_SIZE == 0. The memory will always be poisoned with the value 0, meaning that ASAN will never detect a memory violation except via the SIGSEGV deadly signal handler if the memory has been munmap()'d.

Full Details
I had a guess that maybe the OOM had something to do with the allocator's mmap page hint and produced the following MRE which clearly showed that when a page hint is used we crash. When NULL is provided for the page hint the memory usage mostly plateaus. In both cases this behavior only surfaces when compiled with ASAN enabled:

Compiling without ASAN did not reproduce this behavior.
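A minimal sketch of this kind of reproducer, not the author's original code, might look like the following. It assumes 64-bit Linux; `churn_mappings` and the hint window starting at 2^45 are hypothetical choices made for illustration.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map, touch, and unmap one page `iterations` times.
 * use_hint != 0: pass a pseudo-random page-aligned hint so that nearly
 *                every mapping lands at a fresh address.
 * use_hint == 0: pass NULL and let the kernel choose the address.
 * Returns the number of successful map/unmap rounds. */
size_t churn_mappings(int use_hint, size_t iterations) {
    const size_t page = (size_t)sysconf(_SC_PAGESIZE);
    /* Assumed hint window: starts at 2^45, inside user space on 64-bit Linux. */
    const uintptr_t base = (uintptr_t)(UINT64_C(1) << 45);
    size_t ok = 0;

    for (size_t i = 0; i < iterations; i++) {
        void *hint = NULL;
        if (use_hint) {
            hint = (void *)(base + (uintptr_t)rand() * page);
        }
        /* No MAP_FIXED: the hint is only a suggestion to the kernel. */
        void *p = mmap(hint, page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            continue;
        }
        *(volatile char *)p = 1; /* touch the page so it becomes resident */
        if (munmap(p, page) == 0) {
            ok++;
        }
    }
    return ok;
}
```

To compare the two cases, one could build this with `-fsanitize=address`, call `churn_mappings(1, n)` and `churn_mappings(0, n)` for a large `n` from separate runs, and watch the process RSS.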
I then put together this other example which logged addresses returned from mmap(), which showed that the same address was being returned thousands of times when the page hint is NULL and was likely masking a bug that only occurs when a different address is returned:

Here is what I think is happening:
1. The sanitizers work by hooking various functions, including mmap() and munmap(). The munmap() code path poisons the memory in the same exact manner as mmap() which is a bit suspicious: https://github.com/llvm/llvm-project/blob/ccdce045e22b9d36cc4f41a5e35f6006c9c0fba5/compiler-rt/lib/asan/asan_interceptors.cpp#L152-L179
2. PoisonShadow() will call FastPoisonShadow() which in turn attempts to allocate shadow memory for this memory range: https://github.com/llvm/llvm-project/blob/ccdce045e22b9d36cc4f41a5e35f6006c9c0fba5/compiler-rt/lib/asan/asan_poisoning.h#L70. When using the random page hint, you're almost always going to get a brand new memory range returned from mmap() so ASAN will allocate more shadow memory to describe this memory range.
3. munmap() just calls PoisonShadow() and the shadow mapping is written back to the poison value (0), but the shadow mapping is never released.

The third point is kind of interesting because we don't even really get any value of poisoning with 0. If you attempt to read from this address you receive a SIGSEGV and ASAN just says it doesn't know anything about this address. ASAN also marks the entire
[addr, RoundToNearestPageBoundary(addr+size)) range as addressable, so if you request less than PAGE_SIZE bytes and read beyond addr + size the "out of bounds" read isn't detected. Technically this memory is addressable, but I'd consider reading the slack between size and page boundary a logic bug. This is especially true when mmap'ing files.

Assuming I didn't miss something obvious, I confirmed that the
munmap() calls PoisonShadow()

Which just does a memset()

And then we call munmap() and do nothing else with the shadow memory.

Mitigating
I couldn't find anything about this on the ASAN bug tracker but thought it was worth highlighting. I'm not sure exactly how to go about fixing this in ASAN as I'm not sure if shadow memory was ever meant to be removed, but it's probably possible.
For our case we're able to mitigate the issue by disabling the mmap() page hint in the allocator when ASAN is enabled:
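One way such a switch could look is sketched below. This is not the allocator's actual code: `BUILT_WITH_ASAN` and `choose_mmap_hint` are hypothetical names, and the detection relies on Clang's `__has_feature(address_sanitizer)` and GCC's `__SANITIZE_ADDRESS__` macro.

```c
#include <assert.h>
#include <stddef.h>

/* Detect an ASAN build at compile time. Clang exposes
 * __has_feature(address_sanitizer); GCC defines __SANITIZE_ADDRESS__. */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if !defined(BUILT_WITH_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

/* Hypothetical allocator helper: returns the value to pass as the first
 * argument of mmap(). Under ASAN the random hint is dropped so the kernel
 * can reuse addresses and the shadow memory footprint stays bounded. */
void *choose_mmap_hint(void *random_hint) {
    return BUILT_WITH_ASAN ? NULL : random_hint;
}
```

The trade-off is that ASAN builds lose the fail-fast behavior for UAFs on unmapped ranges that the random hint was meant to provide.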