llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[NVPTX] Global of unaligned vector types cause assertion errors and buffer overflows #59179

Open HazyFish opened 1 year ago

HazyFish commented 1 year ago

Description

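When targeting nvptx or nvptx64, the following code, which contains a v16i20 global, crashes the backend with an assertion failure in debug builds and heap corruption in release builds (stack traces below).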

The problem does not exist when targeting x86_64, aarch64, riscv64, or wasm64.

Minimal Reproduction

https://godbolt.org/z/jnaqz5qq3

Code

@G = global <16 x i20> <i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7, i20 7>

define void @f(<16 x i64>* %0) {
BB:
  %LGV = load <16 x i20>, <16 x i20>* @G
  %C = sext <16 x i20> %LGV to <16 x i64>
  store <16 x i64> %C, <16 x i64>* %0
  ret void
}

Stack Trace (Debug)

llc: /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/APInt.cpp:484: uint64_t llvm::APInt::extractBitsAsZExtValue(unsigned int, unsigned int) const: Assertion `bitPosition < BitWidth && (numBits + bitPosition) <= BitWidth && "Illegal bit extraction"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.  Program arguments: ./llvm-project-latest/build-debug/bin/llc -mtriple=nvptx64 ./crash-reports/dagisel-nvptx64/1.ll
1.  Running pass 'Function Pass Manager' on module './crash-reports/dagisel-nvptx64/1.ll'.
2.  Running pass 'NVPTX Assembly Printer' on function '@f'
 #0 0x00007f8fabb02cfa llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:565:11
 #1 0x00007f8fabb02eab PrintStackTraceSignalHandler(void*) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00007f8fabb01506 llvm::sys::RunSignalHandlers() /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/Signals.cpp:104:5
 #3 0x00007f8fabb035d5 SignalHandler(int) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f8faa2c0980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #5 0x00007f8fa95bce87 raise /build/glibc-CVJwZb/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #6 0x00007f8fa95be7f1 abort /build/glibc-CVJwZb/glibc-2.27/stdlib/abort.c:81:0
 #7 0x00007f8fa95ae3fa __assert_fail_base /build/glibc-CVJwZb/glibc-2.27/assert/assert.c:89:0
 #8 0x00007f8fa95ae472 (/lib/x86_64-linux-gnu/libc.so.6+0x30472)
 #9 0x00007f8fab8f7312 llvm::APInt::extractBitsAsZExtValue(unsigned int, unsigned int) const /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Support/APInt.cpp:485:3
#10 0x00007f8fba86555a llvm::NVPTXAsmPrinter::bufferLEByte(llvm::Constant const*, int, llvm::NVPTXAsmPrinter::AggBuffer*)::$_1::operator()(llvm::APInt const&) const /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1821:20
#11 0x00007f8fba865263 llvm::NVPTXAsmPrinter::bufferLEByte(llvm::Constant const*, int, llvm::NVPTXAsmPrinter::AggBuffer*) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1830:7
#12 0x00007f8fba8639c3 llvm::NVPTXAsmPrinter::bufferAggregateConstant(llvm::Constant const*, llvm::NVPTXAsmPrinter::AggBuffer*) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1904:63
#13 0x00007f8fba8628c2 llvm::NVPTXAsmPrinter::printModuleLevelGV(llvm::GlobalVariable const*, llvm::raw_ostream&, bool, llvm::NVPTXSubtarget const&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1178:25
#14 0x00007f8fba85ed19 llvm::NVPTXAsmPrinter::emitGlobals(llvm::Module const&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:834:52
#15 0x00007f8fba85e807 llvm::NVPTXAsmPrinter::emitFunctionEntryLabel() /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:453:5
#16 0x00007f8fb009d9a5 llvm::AsmPrinter::emitFunctionHeader() /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:983:26
#17 0x00007f8fb009fd73 llvm::AsmPrinter::emitFunctionBody() /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1551:3
#18 0x00007f8fba867b4e llvm::AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/include/llvm/CodeGen/AsmPrinter.h:390:5
#19 0x00007f8fba85f761 llvm::NVPTXAsmPrinter::runOnMachineFunction(llvm::MachineFunction&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:487:29
#20 0x00007f8faf288ff5 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/CodeGen/MachineFunctionPass.cpp:91:8
#21 0x00007f8fae441cc6 llvm::FPPassManager::runOnFunction(llvm::Function&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/IR/LegacyPassManager.cpp:1430:23
#22 0x00007f8fae446af2 llvm::FPPassManager::runOnModule(llvm::Module&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/IR/LegacyPassManager.cpp:1476:16
#23 0x00007f8fae442599 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/IR/LegacyPassManager.cpp:1545:23
#24 0x00007f8fae44210d llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/IR/LegacyPassManager.cpp:535:16
#25 0x00007f8fae446dd1 llvm::legacy::PassManager::run(llvm::Module&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/lib/IR/LegacyPassManager.cpp:1672:3
#26 0x000000000041963c compileModule(char**, llvm::LLVMContext&) /home/henry/aflplusplus-isel/llvm-project-latest/llvm/tools/llc/llc.cpp:736:41
#27 0x00000000004179e2 main /home/henry/aflplusplus-isel/llvm-project-latest/llvm/tools/llc/llc.cpp:417:13
#28 0x00007f8fa959fc87 __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:344:0
#29 0x00000000004171ea _start (./llvm-project-latest/build-debug/bin/llc+0x4171ea)

Stack Trace (Release)

corrupted size vs. prev_size
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.  Program arguments: ./llvm-project-latest/build-release/bin/llc -mtriple=nvptx64 ./crash-reports/dagisel-nvptx64/1.ll
 #0 0x00007f3e4c972a93 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMSupport.so.16git+0x1b9a93)
 #1 0x00007f3e4c9709be llvm::sys::RunSignalHandlers() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMSupport.so.16git+0x1b79be)
 #2 0x00007f3e4c972f2f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f3e4bc15980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #4 0x00007f3e4af11e87 raise /build/glibc-CVJwZb/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #5 0x00007f3e4af137f1 abort /build/glibc-CVJwZb/glibc-2.27/stdlib/abort.c:81:0
 #6 0x00007f3e4af5c837 __libc_message /build/glibc-CVJwZb/glibc-2.27/libio/../sysdeps/posix/libc_fatal.c:181:0
 #7 0x00007f3e4af638ba /build/glibc-CVJwZb/glibc-2.27/malloc/malloc.c:5342:0
 #8 0x00007f3e4af63abc malloc_consolidate /build/glibc-CVJwZb/glibc-2.27/malloc/malloc.c:4486:0
 #9 0x00007f3e4af67848 _int_malloc /build/glibc-CVJwZb/glibc-2.27/malloc/malloc.c:3713:0
#10 0x00007f3e4af6a0ac malloc /build/glibc-CVJwZb/glibc-2.27/malloc/malloc.c:3068:0
#11 0x00007f3e4b90d298 operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x93298)
#12 0x00007f3e4c89f940 llvm::formatted_raw_ostream::releaseStream() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMSupport.so.16git+0xe6940)
#13 0x00007f3e4c89f75c llvm::formatted_raw_ostream::~formatted_raw_ostream() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMSupport.so.16git+0xe675c)
#14 0x00007f3e4da944d5 (anonymous namespace)::MCAsmStreamer::~MCAsmStreamer() MCAsmStreamer.cpp:0:0
#15 0x00007f3e4da944f9 (anonymous namespace)::MCAsmStreamer::~MCAsmStreamer() MCAsmStreamer.cpp:0:0
#16 0x00007f3e4ed1781e llvm::AsmPrinter::~AsmPrinter() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMAsmPrinter.so.16git+0x3081e)
#17 0x00007f3e573239f2 llvm::NVPTXAsmPrinter::~NVPTXAsmPrinter() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMNVPTXCodeGen.so.16git+0x339f2)
#18 0x00007f3e4e14e328 llvm::FPPassManager::~FPPassManager() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMCore.so.16git+0x203328)
#19 0x00007f3e4e14f968 (anonymous namespace)::MPPassManager::~MPPassManager() LegacyPassManager.cpp:0:0
#20 0x00007f3e4e14ff70 non-virtual thunk to (anonymous namespace)::MPPassManager::~MPPassManager() LegacyPassManager.cpp:0:0
#21 0x00007f3e4e14abc8 llvm::PMTopLevelManager::~PMTopLevelManager() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMCore.so.16git+0x1ffbc8)
#22 0x00007f3e4e14ebb7 llvm::legacy::PassManagerImpl::~PassManagerImpl() (/home/henry/aflplusplus-isel/llvm-project-latest/build-release/bin/../lib/libLLVMCore.so.16git+0x203bb7)
#23 0x000000000040ea17 main (./llvm-project-latest/build-release/bin/llc+0x40ea17)
#24 0x00007f3e4aef4c87 __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:344:0
#25 0x00000000004092ea _start (./llvm-project-latest/build-release/bin/llc+0x4092ea)

HazyFish commented 1 year ago

cc @DataCorrupted

junaire commented 1 year ago

I think the problem is here (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp#L1817-L1823):

size_t NumBytes = (Val.getBitWidth() + 7) / 8;
SmallVector<unsigned char, 16> Buf(NumBytes);
for (unsigned I = 0; I < NumBytes; ++I) {
  Buf[I] = Val.extractBitsAsZExtValue(8, I * 8);
}

We calculate how many bytes the value occupies (rounding up), so an i20 gets 3 bytes, and then we extract 8 bits at a time. However, the last byte only has 4 bits left (bits 16 to 19), so extracting 8 bits starting at bit 16 runs past the 20-bit width and triggers the assertion.
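
To make the arithmetic concrete, here is a minimal standalone sketch of the range check. It does not use LLVM: `extractionInRange` and `firstOutOfRangeByte` are hypothetical helpers that only mirror the condition quoted in the assertion message and the rounding used in the loop above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the condition asserted in
// APInt::extractBitsAsZExtValue: the requested window of bits must lie
// entirely inside the value.
bool extractionInRange(unsigned bitWidth, unsigned numBits,
                       unsigned bitPosition) {
  return bitPosition < bitWidth && (numBits + bitPosition) <= bitWidth;
}

// Returns the index of the first byte whose 8-bit window runs past the end
// of a bitWidth-bit value, or -1 if every window fits.
int firstOutOfRangeByte(unsigned bitWidth) {
  std::size_t numBytes = (bitWidth + 7) / 8; // same rounding as the loop
  for (unsigned i = 0; i < numBytes; ++i) {
    if (!extractionInRange(bitWidth, 8, i * 8))
      return static_cast<int>(i);
  }
  return -1;
}
```

For a 20-bit value, `firstOutOfRangeByte(20)` returns 2: bytes 0 and 1 cover bits 0 to 15, while the third window would need bits 16 to 23. For a byte-sized width such as 32 it returns -1.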

I can work on this if someone can tell me what the expected behavior is :)

DataCorrupted commented 1 year ago

Thanks for the analysis. If that is the correct root cause, you can fix it with one line of code: explicitly pad the high bits of Val with zeros by zero-extending it, Val.zext(NumBytes << 3), before this for loop.
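
As a rough illustration of the padding idea, the sketch below models the value as a plain 64-bit integer instead of an APInt. `toLittleEndianBytes` is a hypothetical helper, not LLVM code, and it assumes `bitWidth` is at most 64. Masking the value to its width plays the role of the zero-extension, so every byte read stays in range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: serializes the low bitWidth bits of val into
// little-endian bytes, treating the missing high bits of the last byte as
// zeros (the effect a zero-extension to NumBytes * 8 bits would have).
std::vector<unsigned char> toLittleEndianBytes(std::uint64_t val,
                                               unsigned bitWidth) {
  std::size_t numBytes = (bitWidth + 7) / 8;
  // Clear anything above bitWidth so the padding bits are zero.
  if (bitWidth < 64)
    val &= (std::uint64_t{1} << bitWidth) - 1;
  std::vector<unsigned char> buf(numBytes);
  for (std::size_t i = 0; i < numBytes; ++i)
    buf[i] = static_cast<unsigned char>((val >> (i * 8)) & 0xFF);
  return buf;
}
```

With this model, the i20 constant 7 from the reproducer serializes to the three bytes 0x07, 0x00, 0x00. Note that this only addresses the out-of-range extraction for a single element; it says nothing about how much space the surrounding buffer has reserved.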

DataCorrupted commented 1 year ago

@Artem-B @junaire After some digging I found the root cause of the crash, but I need some help fixing it.

In short, the size of AggBuffer is aligned for the vector as a whole, while the individual elements are not aligned. A <4 x i20> has 80 bits and thus gets 10 bytes. However, when copying a constant into the buffer (shown below), we treat each element as byte aligned, so <4 x i20> would require 4 x 3 = 12 bytes.

https://github.com/llvm/llvm-project/blob/8abd70081f761738e82b37b2891b60ad034f3880/llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp#L1822-L1829

This difference is the root cause of the problem. To fix this, I could modify AggBuffer to allow insertion of individual bits, but I am not sure whether you would find that too hacky, as AggBuffer is designed to be byte aligned.
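
The size mismatch can be checked with a few lines of arithmetic. The sketch below follows the numbers in this comment only; `packedBytes` and `perElementBytes` are hypothetical names, and how AggBuffer actually derives its size is not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Bytes reserved when the vector is sized from its total bit count.
constexpr std::size_t packedBytes(std::size_t numElts, std::size_t eltBits) {
  return (numElts * eltBits + 7) / 8;
}

// Bytes written when every element is rounded up to whole bytes first.
constexpr std::size_t perElementBytes(std::size_t numElts,
                                      std::size_t eltBits) {
  return numElts * ((eltBits + 7) / 8);
}

// <4 x i20>: 80 bits reserve 10 bytes, but 4 x 3 = 12 bytes are written.
static_assert(packedBytes(4, 20) == 10, "whole-vector size");
static_assert(perElementBytes(4, 20) == 12, "per-element size");

// <16 x i20> from the reproducer: 40 bytes reserved, 48 bytes written.
static_assert(packedBytes(16, 20) == 40, "whole-vector size");
static_assert(perElementBytes(16, 20) == 48, "per-element size");
```

Under these assumptions the reproducer would write 8 bytes past the end of the buffer, which is consistent with the heap corruption seen in the release build.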

junaire commented 1 year ago

Hi, I actually tried this before (packing all bits into the AggBuffer without considering their alignment). I also tried preallocating enough space in AggBuffer so that every element, including its alignment padding, would fit. Unfortunately, both strategies failed (verifier crashes, same backtrace as above) :( I bet there's something different about the NVPTX target, so I'd like to hear @Artem-B's ideas...

Artem-B commented 1 year ago

Caveat: I haven't looked into the details yet; I will once I'm back at work. The points below are my general thoughts on this.

Integers whose size is not a power of 2 (let's call power-of-2 sizes PO2) are not handled particularly well in NVPTX in general. I vaguely recall that we promote scalars up to the next PO2 size, but vectors of such oddly-sized integers are likely to have problems, as you've discovered.

"The problem does not exist when targeting x86_64, aarch64, riscv64, or wasm64"

That's not a very strong argument.

What to do:

TBH, I'm not even quite sure what would be the right way to store such a vector in memory.

How does x64 deal with this? One of the implicit assumptions NVPTX users rely on in practice is that in-memory layout matches that of the host and the host sort of happens to be x64, even though these days NVIDIA GPUs are used on PPC and ARM as well. Making sure that we store such a vector the same way we do on x64 would probably make most sense.
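
One candidate layout, offered only as a sketch: the LLVM Language Reference describes the memory layout of a vector with non-byte-sized elements by analogy with bitcasting it to a single wide integer, with element 0 in the least significant bits on little-endian targets. The hypothetical helper `packBitsLE` below packs elements that way. It assumes elements of at most 32 bits, and whether this matches what the x64 backend actually emits for such a global would still need to be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs eltBits-wide elements (eltBits <= 32) back to
// back into bytes, little-endian, with element 0 in the lowest bits.
std::vector<unsigned char> packBitsLE(const std::vector<std::uint32_t> &elts,
                                      unsigned eltBits) {
  std::size_t totalBits = elts.size() * eltBits;
  std::vector<unsigned char> out((totalBits + 7) / 8, 0);
  std::size_t bitPos = 0; // next free bit in the output
  for (std::uint32_t e : elts) {
    for (unsigned b = 0; b < eltBits; ++b, ++bitPos) {
      if ((e >> b) & 1u)
        out[bitPos / 8] |= static_cast<unsigned char>(1u << (bitPos % 8));
    }
  }
  return out;
}
```

For example, packing the <4 x i4> elements 1, 2, 3, 5 yields the bytes 0x21, 0x53, the little-endian encoding of the i16 value 0x5321. Packing sixteen i20 elements yields exactly 40 bytes, the whole-vector size discussed earlier in the thread.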