[LLD] Use-after-free with COFF directives from LTO objects

mstorsjo commented 9 months ago

When linking LTO objects that contain COFF directives (in particular, dllexports), using those directives can result in a use-after-free.

This seems to be triggerable when the LTO object was compiled with an older version of LLVM than the one used for the LTO link.

To reproduce:

$ cat main.c 
void __declspec(dllexport) entry(void) { }
$ ~/clang-16.0/bin/clang -target x86_64-windows-msvc -c main.c -flto
$ lld-link main.o -entry:entry -out:main.exe -subsystem:console
=================================================================
==627895==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff90f0385f at pc 0x000000b7330c bp 0xffffe1d1f5c0 sp 0xffffe1d1f5b8
READ of size 4 at 0xffff90f0385f thread T0
[...]
    #11 0xd695dc in insert /home/martin/code/llvm-project/llvm/include/llvm/ADT/DenseMap.h:228:12
    #12 0xd695dc in lld::coff::LinkerDriver::fixupExports() /home/martin/code/llvm-project/llvm/tools/lld/COFF/DriverUtils.cpp:708:21
    #13 0xd1cd14 in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /home/martin/code/llvm-project/llvm/tools/lld/COFF/Driver.cpp:2547:5
[...]
0xffff90f0385f is located 63 bytes inside of 68-byte region [0xffff90f03820,0xffff90f03864)
freed by thread T0 here:
    #0 0xab0090 in free (/home/martin/code/llvm-project/llvm/build-asan/bin/lld+0xab0090)
    #1 0x3c9bf7c in llvm::lto::InputFile::~InputFile() /home/martin/code/llvm-project/llvm/lib/LTO/LTO.cpp:538:23
    #2 0xed86fc in operator() /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../../include/c++/9/bits/unique_ptr.h:81:2
    #3 0xed86fc in ~unique_ptr /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../../include/c++/9/bits/unique_ptr.h:292:4
    #4 0xed86fc in lld::coff::BitcodeCompiler::add(lld::coff::BitcodeFile&) /home/martin/code/llvm-project/llvm/tools/lld/COFF/LTO.cpp:167:3
[...]
previously allocated by thread T0 here:
    #0 0xab02fc in malloc (/home/martin/code/llvm-project/llvm/build-asan/bin/lld+0xab02fc)
    #1 0xb58874 in safe_malloc /home/martin/code/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xb58874 in llvm::SmallVectorBase<unsigned long>::grow_pod(void*, unsigned long, unsigned long) /home/martin/code/llvm-project/llvm/lib/Support/SmallVector.cpp:143:15
    #3 0xb50908 in grow_pod /home/martin/code/llvm-project/llvm/include/llvm/ADT/SmallVector.h:141:11
    #4 0xb50908 in grow /home/martin/code/llvm-project/llvm/include/llvm/ADT/SmallVector.h:529:41
    #5 0xb50908 in reserve /home/martin/code/llvm-project/llvm/include/llvm/ADT/SmallVector.h:669:13
    #6 0xb50908 in void llvm::SmallVectorImpl<char>::resizeImpl<false>(unsigned long) /home/martin/code/llvm-project/llvm/include/llvm/ADT/SmallVector.h:632:11
    #7 0x79045d4 in resize /home/martin/code/llvm-project/llvm/include/llvm/ADT/SmallVector.h:642:30
    #8 0x79045d4 in upgrade(llvm::ArrayRef<llvm::BitcodeModule>) /home/martin/code/llvm-project/llvm/lib/Object/IRSymtab.cpp:403:13
    #9 0x7903aac in llvm::irsymtab::readBitcode(llvm::BitcodeFileContents const&) /home/martin/code/llvm-project/llvm/lib/Object/IRSymtab.cpp
    #10 0x78fcd8c in llvm::object::readIRSymtab(llvm::MemoryBufferRef) /home/martin/code/llvm-project/llvm/lib/Object/IRObjectFile.cpp:146:46
    #11 0x3c9c23c in llvm::lto::InputFile::create(llvm::MemoryBufferRef) /home/martin/code/llvm-project/llvm/lib/LTO/LTO.cpp:543:35
[...]

The reason is that the export directives are stored as StringRef pointing at the source memory. In the case of LTO objects that have been upgraded, this is memory owned by SmallVector<char, 0> Strtab here: https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/include/llvm/LTO/LTO.h#L119-L125

After compiling LTO, this object is destroyed, and the StringRef is left with a dangling pointer. For LTO objects that didn't need to be upgraded (see https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Object/IRSymtab.cpp#L404-L438), the StringRef seems to point into memory elsewhere (maybe part of a memory map?) which isn't destroyed at that time.

This was reported downstream as https://github.com/mstorsjo/llvm-mingw/issues/392.
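To make the two cases concrete, here is a minimal sketch using only the standard library. It is not LLVM code: ToyInputFile, createInput, pointsInto and directivesOutliveInput are hypothetical names, std::string_view stands in for StringRef, and std::vector<char> stands in for the SmallVector<char, 0> Strtab. It only checks which storage the directive view points into; it never reads freed memory.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for an LTO input file: it may own a rebuilt string
// table (the "upgraded" case) or only reference the caller's file buffer.
struct ToyInputFile {
  std::vector<char> ownedStrtab; // freed when the ToyInputFile is destroyed
  std::string_view directives;   // non-owning view, like a StringRef
};

// "upgraded" models a symbol table that had to be rebuilt in memory.
std::unique_ptr<ToyInputFile> createInput(std::string_view fileBuffer,
                                          bool upgraded) {
  auto f = std::make_unique<ToyInputFile>();
  if (upgraded) {
    f->ownedStrtab.assign(fileBuffer.begin(), fileBuffer.end());
    f->directives =
        std::string_view(f->ownedStrtab.data(), f->ownedStrtab.size());
  } else {
    f->directives = fileBuffer;
  }
  return f;
}

// True if the bytes of `view` lie inside `buffer`.
bool pointsInto(std::string_view view, std::string_view buffer) {
  std::less_equal<const char *> le;
  return le(buffer.data(), view.data()) &&
         le(view.data() + view.size(), buffer.data() + buffer.size());
}

// True when the directive view refers to the long-lived file buffer, so it
// would stay valid after the input file object is destroyed.
bool directivesOutliveInput(bool upgraded) {
  static const std::string mappedFile = "/EXPORT:entry";
  auto input = createInput(mappedFile, upgraded);
  return pointsInto(input->directives, mappedFile);
}
```

In this sketch, directivesOutliveInput(false) holds because the view aliases the long-lived buffer, while directivesOutliveInput(true) does not: the view aliases ownedStrtab, which goes away together with the input object.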

CC @rnk @MaskRay

llvmbot commented 9 months ago

@llvm/issue-subscribers-lld-coff


rnk commented 9 months ago

As you say, yes, the assumption here was that the StringRef should refer to the memory mapped object file, or a stable copy created in the export parsing code, which lives forever.

Would it be sufficient to stabilize the COFF linker options when we read them in, since LTO object lifetime is shorter than Export lifetime? Say, here: https://github.com/llvm/llvm-project/blob/main/lld/COFF/InputFiles.cpp#L1084C41-L1084C41

mstorsjo commented 9 months ago

> As you say, yes, the assumption here was that the StringRef should refer to the memory mapped object file, or a stable copy created in the export parsing code, which lives forever.
>
> Would it be sufficient to stabilize the COFF linker options when we read them in, since LTO object lifetime is shorter than Export lifetime? Say, here: https://github.com/llvm/llvm-project/blob/main/lld/COFF/InputFiles.cpp#L1084C41-L1084C41

It would be sufficient to e.g. wrap that in a StringSaver::save() operation, yeah. For most LTO object files, the StringRef seems to not refer to the SmallVector<char, 0> Strtab but apparently somewhere in the memory mapped file, or somewhere else that sticks around, but for these LTO object files that need to be upgraded, we run into this. Tossing all these LTO file directives in a StringSaver will use up a bit of memory for sure, but perhaps it's tolerable here?
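As a rough illustration of the idea (not the actual patch), the sketch below copies a directive string into storage that outlives the short-lived input before keeping a view of it. ToySaver is a hypothetical standard-library stand-in for a string saver, and readDirectives and savedDirectiveAfterFree are made-up helper names.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for a string saver: copies a string into storage that
// lives as long as the saver and returns a view of that copy.
class ToySaver {
  // std::deque does not relocate existing elements on emplace_back, so views
  // handed out earlier remain valid as more strings are saved.
  std::deque<std::string> storage;

public:
  std::string_view save(std::string_view s) {
    storage.emplace_back(s);
    return storage.back();
  }
};

// Models reading directives from a short-lived LTO input and stabilizing them.
std::string_view readDirectives(ToySaver &saver) {
  auto shortLived = std::make_unique<std::string>("/EXPORT:entry");
  std::string_view saved = saver.save(*shortLived);
  shortLived.reset(); // the source buffer is gone, like the freed Strtab
  return saved;       // still valid: it refers to the saver's copy
}

// Convenience wrapper: returns the saved directive after the source was freed.
std::string savedDirectiveAfterFree() {
  ToySaver saver;
  return std::string(readDirectives(saver));
}
```

The cost is one extra copy per directive string, which matches the memory trade-off mentioned above.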

MaskRay commented 9 months ago

> The reason is that the export directives are stored as StringRef pointing at the source memory. In the case of LTO objects that have been upgraded, this is memory owned by SmallVector<char, 0> Strtab here: llvmorg-17.0.6/llvm/include/llvm/LTO/LTO.h#L119-L125 After compiling LTO, this object is destructed, and the StringRef is left with a dangling pointer. For LTO objects that didn't need to be upgraded (see llvmorg-17.0.6/llvm/lib/Object/IRSymtab.cpp#L404-L438), the StringRef seems to point into memory elsewhere (maybe part of a memory map?) which isn't destructed at that time.

It seems that InputFile::~InputFile is called at checkError(ltoObj->add(std::move(f.obj), resols));.

> It would be sufficient to e.g. wrap that in a StringSaver::save() operation

Yes, using StringSaver::save() seems reasonable.

mstorsjo commented 9 months ago

Fixed in https://github.com/llvm/llvm-project/commit/d0986519d58e6d71656019cfa6604efa4bf6d3e7.


[LLD] Use-after-free with COFF directives from LTO objects #78591