systems-nuts / unifico

Compiler and build harness for heterogeneous-ISA binaries with the same stack layout.

Compilation error with `-O2` flag in `cg` and `ft` #209

Open blackgeorge-boom opened 2 years ago

blackgeorge-boom commented 2 years ago

Probably related to the callsite alignment.

E.g., the optimizations eliminate some of the call sites, so the binaries end up with a different number of call sites. We need to make LLVM more robust to these kinds of cases.
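
A minimal sketch of how such a mismatch could be detected, assuming we already have the list of call-site IDs recorded for each binary. The function name and the ID lists are hypothetical and only serve as an illustration; they are not part of the toolchain.

```python
from collections import Counter


def callsite_mismatches(x86_ids, arm_ids):
    """Return {call-site ID: (x86 count, aarch64 count)} for IDs whose counts differ."""
    x86_counts = Counter(x86_ids)
    arm_counts = Counter(arm_ids)
    mismatches = {}
    for cs_id in sorted(set(x86_counts) | set(arm_counts)):
        if x86_counts[cs_id] != arm_counts[cs_id]:
            mismatches[cs_id] = (x86_counts[cs_id], arm_counts[cs_id])
    return mismatches


# Hypothetical ID lists; in practice they would be read from each binary.
x86_64_ids = [0, 1, 2, 3, 4]
aarch64_ids = [0, 1, 3, 4]  # call site 2 was eliminated on one side only

print(callsite_mismatches(x86_64_ids, aarch64_ids))  # {2: (1, 0)}
```

An empty result means both binaries recorded the same call sites.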

blackgeorge-boom commented 2 years ago

Actually, it is another kind of error (probably more tricky to solve):

~/llvm-9/toolchain/bin/llc -function-sections -data-sections -relocation-model=pic --trap-unreachable -optimize-regalloc -fast-isel=false -disable-machine-cse -disable-block-align --mc-relax-all -disable-x86-frame-obj-order -aarch64-csr-alignment=8 -align-bytes-to-four -reg-scavenging-slot -enable-misched=false -mattr=+disable-hoist-in-lowering,+disable-fp-imm-materialize,-avoid-f128,+avoid-wide-mul-add -march=aarch64 -filetype=obj -o cg_aarch64_init.o cg_opt.ll
error: ran out of registers during register allocation
error: ran out of registers during register allocation
warning: cg.c:191:3: (x86_64-unknown-linux-gnu) Stack transformation: unhandled register R15B across call to puts
warning: cg.c:192:3: (x86_64-unknown-linux-gnu) Stack transformation: unhandled register R15B across call to printf
warning: cg.c:193:3: (x86_64-unknown-linux-gnu) Stack transformation: unhandled register R15B across call to printf
warning: cg.c:194:3: (x86_64-unknown-linux-gnu) Stack transformation: unhandled register R15B across call to putchar
Stack dump:
0.      Program arguments: /home/blackgeorge/llvm-9/toolchain/bin/llc -function-sections -data-sections -relocation-model=pic --trap-unreachable -optimize-regalloc -fast-isel=false -disable-machine-cse -disable-block-align --mc-relax-all -disable-x86-frame-obj-order -aarch64-csr-alignment=8 -align-bytes-to-four -reg-scavenging-slot -enable-misched=false -mattr=+disable-hoist-in-lowering,+disable-fp-imm-materialize,-avoid-f128,+avoid-wide-mul-add -march=aarch64 -filetype=obj -o cg_aarch64_init.o cg_opt.ll 
1.      Running pass 'Function Pass Manager' on module 'cg_opt.ll'.
2.      Running pass 'AArch64 Assembly Printer' on function '@main'
make: *** [../../common/common.mk:296: cg_x86_64_init.o] Error 1
make: *** Waiting for unfinished jobs....
 #0 0x000055c9752856cd llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/blackgeorge/llvm-project/llvm/lib/Support/Unix/Signals.inc:533:0
 #1 0x000055c975285760 PrintStackTraceSignalHandler(void*) /home/blackgeorge/llvm-project/llvm/lib/Support/Unix/Signals.inc:594:0
 #2 0x000055c97528363a llvm::sys::RunSignalHandlers() /home/blackgeorge/llvm-project/llvm/lib/Support/Signals.cpp:68:0
 #3 0x000055c975285084 SignalHandler(int) /home/blackgeorge/llvm-project/llvm/lib/Support/Unix/Signals.inc:385:0
 #4 0x00007f04cf10c540 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x15540)
 #5 0x000055c973817d2a llvm::ilist_node_base<true>::getNext() const /home/blackgeorge/llvm-project/llvm/include/llvm/ADT/ilist_node_base.h:43:0
 #6 0x000055c973c0ae6c llvm::ilist_node_impl<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void> >::getNext() const /home/blackgeorge/llvm-project/llvm/include/llvm/ADT/ilist_node.h:75:0
 #7 0x000055c973c0a6e3 llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>::operator++() /home/blackgeorge/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:158:0
 #8 0x000055c973f15220 llvm::simple_ilist<llvm::Instruction>::begin() const /home/blackgeorge/llvm-project/llvm/include/llvm/ADT/simple_ilist.h:118:0
 #9 0x000055c973f34306 llvm::BasicBlock::begin() const /home/blackgeorge/llvm-project/llvm/include/llvm/IR/BasicBlock.h:269:0
#10 0x000055c974726aa5 llvm::StackMaps::recordPcnStackMapOpers(llvm::MachineInstr const&, unsigned long, llvm::MachineOperand const*, llvm::MachineOperand const*, bool) /home/blackgeorge/llvm-project/llvm/lib/CodeGen/StackMaps.cpp:775:0
#11 0x000055c974727b24 llvm::StackMaps::recordPcnStackMap(llvm::MachineInstr const&) /home/blackgeorge/llvm-project/llvm/lib/CodeGen/StackMaps.cpp:942:0
#12 0x000055c973c3c50a (anonymous namespace)::AArch64AsmPrinter::LowerPCN_STACKMAP(llvm::MCStreamer&, llvm::StackMaps&, llvm::MachineInstr const&) /home/blackgeorge/llvm-project/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp:810:0
#13 0x000055c973c3e378 (anonymous namespace)::AArch64AsmPrinter::EmitInstruction(llvm::MachineInstr const*) /home/blackgeorge/llvm-project/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp:1114:0
#14 0x000055c9741f1ecb llvm::AsmPrinter::EmitFunctionBody() /home/blackgeorge/llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1108:0
#15 0x000055c973c38fa7 (anonymous namespace)::AArch64AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) /home/blackgeorge/llvm-project/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp:152:0
#16 0x000055c9745044a1 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/blackgeorge/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:0
#17 0x000055c974a0e186 llvm::FPPassManager::runOnFunction(llvm::Function&) /home/blackgeorge/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:0
#18 0x000055c974a0e477 llvm::FPPassManager::runOnModule(llvm::Module&) /home/blackgeorge/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:0
#19 0x000055c974a0e8b7 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/blackgeorge/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:0
#20 0x000055c974a0f077 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/blackgeorge/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863:0
#21 0x000055c974a0f269 llvm::legacy::PassManager::run(llvm::Module&) /home/blackgeorge/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1895:0
#22 0x000055c97381221d compileModule(char**, llvm::LLVMContext&) /home/blackgeorge/llvm-project/llvm/tools/llc/llc.cpp:630:0
#23 0x000055c97381083e main /home/blackgeorge/llvm-project/llvm/tools/llc/llc.cpp:376:0
#24 0x00007f04ceb821e3 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x271e3)
#25 0x000055c97380e9ce _start (/home/blackgeorge/llvm-9/toolchain/bin/llc+0x11c09ce)
Segmentation fault (core dumped)

blackgeorge-boom commented 2 years ago

It seems that the offending instruction is a stackmap instruction:

call void (i64, i32, ...) @llvm.experimental.pcn.stackmap(i64 17, i32 0, i1 %.b411, i32* %arrayidx115.i.i, i32* %arrayidx119.i.i, i32* %arrayidx172.i.i, i32* %arrayidx79.i.i, double %call72.i.i, i1 %cmp108.i.i, i64 %idxprom114.i.i, i64 %idxprom118.i.i, i64 %indvars.iv88.i.i, i64 %indvars.iv90.i.i, i64 %indvars.iv92.i.i, [8 x i32]* %ivc.i, i32 %k.1.lcssa.i.i, double %mul.i.i, double* %rnorm, double %size.037.i.i, %struct.timeval* %tv122, %struct.timeval* %tv199, %struct.timeval* %tv281, double %va.0.i.i, [8 x double]* %vc.i, i64 %wide.trip.count104.i.i, i64 %wide.trip.count119.i.i430505, i64 %wide.trip.count94.i.i, i8* %0, i8* %6, i8* %7, i32 %60, i32 %65)

which, in the -O2 version, seems to have a lot of arguments.

This might cause a problem, as described here:

https://github.com/llvm/llvm-project/issues/56880
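
To see which stackmap calls carry the most live values, something like the following could be run over the lines of the `.ll` file. It is a rough sketch that assumes each call is printed on a single line with the ID as the first operand, as in the call above; the helper names are made up for illustration.

```python
STACKMAP = "@llvm.experimental.pcn.stackmap"


def split_top_level(args):
    """Split on commas that are not nested inside (), [], {} or <>."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch in "([{<":
            depth += 1
        elif ch in ")]}>":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def stackmap_live_values(line):
    """Return (stackmap ID, number of live values) for a stackmap call, else None."""
    start = line.find(STACKMAP + "(")
    if start < 0:
        return None
    inner = line[start + len(STACKMAP) + 1 : line.rindex(")")]
    args = split_top_level(inner)
    stackmap_id = int(args[0].split()[1])  # first operand looks like "i64 17"
    return stackmap_id, len(args) - 2      # skip the ID and the i32 operand


example = ("call void (i64, i32, ...) @llvm.experimental.pcn.stackmap("
           "i64 17, i32 0, i1 %.b411, [8 x i32]* %ivc.i, double %mul.i.i)")
print(stackmap_live_values(example))  # (17, 3)
```

Sorting the results by the second field would point at the call sites most likely to put pressure on the register allocator.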

compor commented 2 years ago

Great digging. Maybe consider applying an exclusion for that opcode as they've done in the patch mentioned in that issue?

blackgeorge-boom commented 2 years ago

It seems like a simple opcode comparison and

That is a real cliffhanger.

blackgeorge-boom commented 2 years ago

Great digging. Maybe consider applying an exclusion for that opcode as they've done in the patch mentioned in that issue?

Indeed, that would be a good solution. Not sure how easy it is for stackmaps, since it has not been fixed yet by those who first discovered the issue.

If we want to skip this for now and just get measurements with -O2, a quick fix is to not include the stackmaps. This way, we cannot verify that the stack layout is the same across the binaries, but we can still measure the binary's execution time.
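
One possible way to do this, sketched as a text filter over the IR before it is handed to `llc`. It assumes each stackmap call sits on its own line; the calls return `void`, so nothing else uses them. The function name is hypothetical, and a proper switch in the build harness, if one exists, would be the better option.

```python
STACKMAP_CALL = "@llvm.experimental.pcn.stackmap("


def strip_stackmaps(ir_text):
    """Drop every line that calls the stackmap intrinsic; keep everything else."""
    kept = []
    for line in ir_text.splitlines():
        is_call = line.lstrip().startswith(("call ", "tail call "))
        if is_call and STACKMAP_CALL in line:
            continue  # skip the stackmap call itself
        kept.append(line)  # the `declare` line is kept, which is harmless
    return "\n".join(kept) + "\n"


sample = """define void @f() {
  call void (i64, i32, ...) @llvm.experimental.pcn.stackmap(i64 0, i32 0)
  ret void
}
declare void @llvm.experimental.pcn.stackmap(i64, i32, ...)
"""
print(strip_stackmaps(sample))
```

The filtered IR is only good for timing runs, since no stack metadata is emitted for it.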

compor commented 2 years ago

LMAO, just brain 💨

compor commented 2 years ago

No objection to that for now.