facebookarchive / BOLT

Binary Optimization and Layout Tool - A Linux command-line utility used for optimizing performance of binaries

Assert error in resolveAArch64Relocation #295

Closed HShan886 closed 1 year ago

HShan886 commented 2 years ago

When using the latest BOLT, I hit an assertion failure with the following stack trace:

#9 0x000000000276e181 llvm::RuntimeDyldELF::resolveAArch64Relocation(llvm::SectionEntry const&, unsigned long, unsigned long, unsigned int, long) /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:493:48
#10 0x00000000027705b4 llvm::RuntimeDyldELF::resolveRelocation(llvm::SectionEntry const&, unsigned long, unsigned long, unsigned int, long, unsigned long, unsigned int) /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1052:5
#11 0x00000000027704cc llvm::RuntimeDyldELF::resolveRelocation(llvm::RelocationEntry const&, unsigned long) /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1033:27
#12 0x0000000002750b96 llvm::RuntimeDyldImpl::resolveRelocationList(llvm::SmallVector<llvm::RelocationEntry, 64u> const&, unsigned long) /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1112:22
#13 0x00000000027510db llvm::RuntimeDyldImpl::applyExternalSymbolRelocations(llvm::StringMap<llvm::JITEvaluatedSymbol, llvm::MallocAllocator>) /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1118:24
#14 0x000000000275164d llvm::RuntimeDyldImpl::resolveExternalSymbols() /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1218:33
#15 0x000000000274ba07 llvm::RuntimeDyldImpl::resolveRelocations() /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:133:42
#16 0x000000000275281e llvm::RuntimeDyld::resolveRelocations() /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1403:70
#17 0x0000000002752909 llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() /tmp/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1422:19
#18 0x00000000011483e5 llvm::bolt::RewriteInstance::emitAndLink() /tmp/llvm-project/bolt/lib/Rewrite/RewriteInstance.cpp:3151:23

The main reason is that BOLT resolves the symbol _ZNSs4_Rep10_M_destroyERKSaIcE@PLT in the .plt section of the binary. The caller function is rewritten into the .text.cold section, but the callee _ZNSs4_Rep10_M_destroyERKSaIcE@PLT stays in the .plt section. Therefore the offset between caller and callee is out of range for the 26-bit branch immediate.

More information: the relocation type is R_AARCH64_JUMP26/R_AARCH64_CALL26, and BOLT reports the new section address as: BOLT-INFO: creating new program header table at address 0x8a00000, offset 0x8600000

Any suggestions would be appreciated.
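
For reference, the range constraint described in this report can be sketched as below. This is a minimal illustration, not BOLT or RuntimeDyld code: the function names are made up for the example, the .plt address is hypothetical, and only 0x8a00000 comes from the BOLT-INFO line. It shows the arithmetic behind the reporter's hypothesis, which is not necessarily the confirmed root cause.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value fits in a signed two's-complement field of `bits` bits."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


def branch26_in_range(target: int, place: int, addend: int = 0) -> bool:
    """Range check for R_AARCH64_CALL26 / R_AARCH64_JUMP26.

    The relocation computes S + A - P as a byte offset. B/BL store that
    offset divided by 4 in a 26-bit field, so the byte offset must fit in
    28 signed bits, i.e. within +/-128 MiB of the branch.
    """
    return fits_signed(target + addend - place, 28)


# Address taken from the BOLT-INFO line in the report.
NEW_SEGMENT_ADDR = 0x8A00000
# Hypothetical .plt address, chosen only for illustration.
HYPOTHETICAL_PLT_ADDR = 0x400000

print(branch26_in_range(HYPOTHETICAL_PLT_ADDR, NEW_SEGMENT_ADDR))  # False
```

Under these assumptions, a branch located exactly at 0x8a00000 could only reach targets at or above 0xa00000 with a direct backward B/BL, so anything placed lower would need a veneer or an indirect branch.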

yota9 commented 2 years ago

Hello! Which BOLT version are you using? Using the latest BOLT with this commit (https://github.com/llvm/llvm-project/commit/4c14519ecbba54870553611ed34dbec596e1f7e7) should fix the problem.

HShan886 commented 2 years ago

@yota9 Thank you for replying. I am using the latest BOLT, which already includes this commit, but it does not fix the problem.

yota9 commented 2 years ago

@Haishan312 May I see the binary, or at least the objdump of the function plus the readelf sections/segments output? The binary would be preferable, since I need to see exactly what is going on there.

HShan886 commented 2 years ago

@yota9 You can download the binary from this repo (git clone https://github.com/Haishan312/bolt-fixbugs.git), because GitHub issues limit the size of attached files.

yota9 commented 2 years ago

@Haishan312 Yes, I was able to download it, thanks. I also need your BOLT options (and possibly the fdata file) to reproduce the issue.

HShan886 commented 2 years ago

@yota9 The BOLT command line is:

llvm-bolt ${binary_name} -o ${binary_name}.bolt -data=perf.fdata -reorder-blocks=cache+ -reorder-functions=hfsort -split-functions=2 -split-all-cold -split-eh -dyno-stats --update-debug-sections -v=2

The fdata file is attached: 502.fdata.zip

yota9 commented 2 years ago

@Haishan312 Thanks, I was able to reproduce the issue. It is not related to the PLT, but I will try to resolve it.

HShan886 commented 2 years ago

@yota9 That's great, thank you. I have another case which is related to the PLT. I am recovering a small binary.

yota9 commented 2 years ago

Hello @Haishan312! Sorry for the late reply. I've checked your binary; the problem is known to me and I've raised it earlier with golang support. Nevertheless, I've found a way to fix it easily with this commit: https://github.com/llvm/llvm-project/commit/fd9604952d80d62bc3db57fff07c047bb6773903 . Please try to rebase BOLT on the latest commits and check whether there are any other problems. Thank you!

HShan886 commented 2 years ago

@yota9 Thank you very much. This commit resolves my problem.

yota9 commented 2 years ago

@Haishan312 Great! Please close the issue if everything is solved :)

HShan886 commented 2 years ago

@yota9 Your patch generates an illegal instruction on the aarch64 platform. When I dump the assembly with llvm-objdump, I get an unknown instruction:

ae3254: 60 d9 72 b8 ldr w0, [x11, w18, sxtw #2]
ae3258: 00 b2 fc 7f unknown
ae325c: e0 03 11 aa mov x0, x17

You can download the original files from the attachment: illegel_instr.tar.gz

yota9 commented 2 years ago

Hello @Haishan312! I don't think the patch has anything to do with that. For some reason the tbz instruction was not relaxed; I will look into the reason soon. Thank you for reporting!
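
As background on why a tbz may need relaxing: TBZ/TBNZ encode their target in a 14-bit immediate scaled by 4, so they reach only about ±32 KiB, far less than B/BL. The sketch below is a minimal illustration of that range check. The function name and both addresses are hypothetical and are not taken from the binary in this issue.

```python
def tbz_in_range(target: int, place: int) -> bool:
    """Range check for a TBZ/TBNZ branch (relocation R_AARCH64_TSTBR14).

    The instruction stores (target - place) / 4 in a signed 14-bit field,
    so the byte offset must fit in 16 signed bits: -32768 <= offset < 32768.
    """
    offset = target - place
    return -(1 << 15) <= offset < (1 << 15)


# Hypothetical addresses, for illustration only.
HOT_FRAGMENT_ADDR = 0x400000        # where the tbz lives
NEARBY_TARGET = 0x400000 + 0x7FFC   # just inside the reach of tbz
COLD_FRAGMENT_ADDR = 0x8A00000      # a split-off cold block far away

print(tbz_in_range(NEARBY_TARGET, HOT_FRAGMENT_ADDR))       # True
print(tbz_in_range(COLD_FRAGMENT_ADDR, HOT_FRAGMENT_ADDR))  # False
```

When the check fails, the usual relaxation is to invert the test (tbz becomes tbnz, or the reverse) so it skips over an unconditional b to the far target; if that step is missed, the out-of-range offset cannot be encoded correctly.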

yota9 commented 2 years ago

@Haishan312 The problem was found in another part of LLVM: https://reviews.llvm.org/D128740 . It may take some time before it is approved, so feel free to apply this patch manually :) UPD: The patch has landed.

aaupov commented 1 year ago

Assuming the issue is resolved with rGb27d6ffe4e4a.