facebookarchive / BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries

Failed to match indirect branch when BOLT AArch64 binary #111

Open wade901 opened 3 years ago

wade901 commented 3 years ago

Hello,

When I try to use BOLT to optimize an AArch64 application, BOLT aborts while analyzing indirect branch fragments. The indirect branch code is shown below. Does BOLT already support all AArch64 indirect branch patterns?

    652ec8:   900001e6    adrp    x6, 68e000 <plone+0x18>
    652ecc:   9121c0c6    add     x6, x6, #0x870
    652ed0:   8b0e08c6    add     x6, x6, x14, lsl #2
    652ed4:   b94000c7    ldr     w7, [x6]
    652ed8:   8b27c0c6    add     x6, x6, w7, sxtw
    652edc:   d61f00c0    br      x6

    4716dc:   b000171c    adrp    x28, 752000 <__func__.6103+0x120>
    ......
    4716e4:   91188380    add     x0, x28, #0x620
    ......
    4716f0:   9e67000a    fmov    d10, x0
    ......
    4717b8:   9e660141    fmov    x1, d10
    4717bc:   78605820    ldrh    w0, [x1,w0,uxtw #1]
    4717c0:   10000061    adr     x1, 4717cc <intr_thread_main+0x23c>
    4717c4:   8b20a820    add     x0, x1, w0, sxth #2
    4717c8:   d61f0000    br      x0

Log:

BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: a4d81e618940629a402e03d9d4c351c0ca7c68fe
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: creating new program header table at address 0x800000, offset 0x400000
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: static input executable detected
BOLT-WARNING: non-relocation mode for AArch64 is not fully supported
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-WARNING: disabling -split-eh in non-relocation mode
BOLT-INFO: pre-processing profile using branch profile reader
Failed to match indirect branch! (fragment 2)
UNREACHABLE executed at llvm/tools/llvm-bolt/src/Target/AArch64/AArch64MCPlusBuilder.cpp:507!
#0 0x00000000010afb0a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (./build/bin/llvm-bolt+0x10afb0a)
#1 0x00000000010ad94e llvm::sys::RunSignalHandlers() (./build/bin/llvm-bolt+0x10ad94e)
#2 0x00000000010adbca SignalHandler(int) (./build/bin/llvm-bolt+0x10adbca)
#3 0x00007ffbd883a620 __restore_rt (/lib64/libpthread.so.0+0xf620)
#4 0x00007ffbd7d6b277 __GI_raise (/lib64/libc.so.6+0x36277)
#5 0x00007ffbd7d6c968 __GI_abort (/lib64/libc.so.6+0x37968)
#6 0x0000000001068f1a (./build/bin/llvm-bolt+0x1068f1a)
#7 0x00000000010c6f59 (anonymous namespace)::AArch64MCPlusBuilder::analyzeIndirectBranch(llvm::MCInst&, llvm::bolt::MCPlusBuilder::InstructionIterator, llvm::bolt::MCPlusBuilder::InstructionIterator, unsigned int, llvm::MCInst*&, unsigned int&, unsigned int&, long&, llvm::MCExpr const*&, llvm::MCInst*&) const (./build/bin/llvm-bolt+0x10c6f59)
#8 0x00000000004996dc llvm::bolt::BinaryFunction::processIndirectBranch(llvm::MCInst&, unsigned int, unsigned long, unsigned long&) (./build/bin/llvm-bolt+0x4996dc)
#9 0x00000000004a0109 llvm::bolt::BinaryFunction::disassemble() (./build/bin/llvm-bolt+0x4a0109)
#10 0x000000000052ca08 llvm::bolt::RewriteInstance::disassembleFunctions() (./build/bin/llvm-bolt+0x52ca08)
#11 0x0000000000585413 llvm::bolt::RewriteInstance::run() (./build/bin/llvm-bolt+0x585413)
#12 0x00000000004147ef main (./build/bin/llvm-bolt+0x4147ef)
#13 0x00007ffbd7d57445 __libc_start_main (/lib64/libc.so.6+0x22445)
#14 0x000000000045d16d _start (./build/bin/llvm-bolt+0x45d16d)
Stack dump:
0.  Program arguments: ./build/bin/llvm-bolt ./app -o ./app-bolt -data=./app.fdata -reorder-blocks=cache+ -split-functions=2 -split-all-cold -split-eh -dyno-stats
Aborted

rafaelauler commented 3 years ago

No, AArch64 BOLT doesn't support all possible indirect branch fragments. Our AArch64 port is experimental, but I see in your log this message:

BOLT-WARNING: non-relocation mode for AArch64 is not fully supported

Did you try a binary with relocations?

That said, the AArch64 port is far less mature than the X86 one and is also more challenging to support. Because of its RISC nature, each instruction encodes less semantics, and the compiler frequently needs several instructions to do what X86 does in a single one. This requires BOLT to match and recognize patterns spanning multiple instructions. That's tricky because the pattern may be spread across different basic blocks, which requires a dataflow analysis to track the relationship between instructions across the entire CFG. But we can't build the complete CFG without understanding the indirect branches in the function. That's why it's a bit more involved to properly disassemble and reconstruct the CFG of a RISC binary. What we currently have implemented for AArch64 is a best-effort strategy that worked well for gcc-generated binaries, but that we can't guarantee will always work.
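To make the multi-instruction point concrete, here is a rough Python sketch of what the first sequence in the report (the one ending in `br x6`) computes once all its instructions are considered together. This is not BOLT code: the function name and the two table entries are made up for illustration, and only the table address (0x68e000 + 0x870) comes from the disassembly.

```python
import struct

# Table address built by the ADRP/ADD pair in the report: 0x68e000 + 0x870.
TABLE_ADDR = 0x68E000 + 0x870

# Hypothetical table contents (NOT from the report): two signed 32-bit
# little-endian offsets, each relative to the address of its own entry.
memory = struct.pack("<2i", -0x3B900, -0x3B8F4)


def pc_relative_jump_target(mem, mem_base, table_addr, index):
    """Evaluate the add/ldr/add/br sequence for a given index register value."""
    entry_addr = table_addr + (index << 2)       # add x6, x6, x14, lsl #2
    (entry,) = struct.unpack_from("<i", mem, entry_addr - mem_base)
    #                                            # ldr w7, [x6] (sign-extended below)
    return entry_addr + entry                    # add x6, x6, w7, sxtw ; br x6


for i in range(2):
    print(hex(pc_relative_jump_target(memory, TABLE_ADDR, TABLE_ADDR, i)))
```

None of the four instructions says "jump table" on its own; the target only falls out once the base, the scaled index, the load, and the sign-extending add are all tied together. That is the kind of pattern a matcher has to recover.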

Yet, relocations for AArch64 are mandatory to properly recognize ADRP/ADR pairs. Without relocations, we won't understand that an ADRP instruction is trying to build the address of, say, ObjectX. We will only have a partial view, in which ADRP accesses the page of ObjectX, and will create references against Page(ObjectX) instead of ObjectX. This can cause us to link incorrectly, depending on where Page(ObjectX) lands (in a different, unrelated section, for example).
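A small Python sketch of the Page(ObjectX) problem, using the address from the first sequence in the report (0x68e000 + 0x870). The "new" addresses and the helper names are hypothetical, chosen only to show how the two halves can drift apart. It is an illustration of the arithmetic, not of how BOLT is implemented.

```python
PAGE_MASK = 0xFFF  # ADRP works on 4 KiB pages


def page(addr):
    """High part materialized by ADRP."""
    return addr & ~PAGE_MASK


def lo12(addr):
    """Low 12 bits supplied by the following ADD immediate."""
    return addr & PAGE_MASK


object_x = 0x68E870                      # adrp x6, 68e000 ; add x6, x6, #0x870
adrp_page, add_imm = page(object_x), lo12(object_x)

# Hypothetical layout after rewriting: ObjectX moves to a new address.
new_object_x = 0x792A10

# With relocations, both halves are known to refer to ObjectX and are
# recomputed from its new address.
with_relocs = page(new_object_x) + lo12(new_object_x)

# Without relocations, only "Page(ObjectX)" is visible. Assume (for the
# sake of the example) that this page address is treated as belonging to
# some other section and ends up at 0x88e000, while the ADD immediate
# keeps its old value.
new_page_symbol = 0x88E000
without_relocs = new_page_symbol + add_imm

print(hex(with_relocs), hex(without_relocs))
```

In the second case the computed address no longer has anything to do with where ObjectX actually lives, which is the incorrect linking described above.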