facebookarchive / BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries
2.51k stars 176 forks source link

Instrumentation crashes on Rust code #217

Closed lnicola closed 2 years ago

lnicola commented 2 years ago
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 0ae4e2d854c6d2435088cef682e967c59b15f628
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1800000, offset 0x1800000
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-WARNING: Failed to analyze 649 relocations
BOLT-WARNING: 73 collisions detected while hashing binary objects. Use -v=1 to see the list.
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 135588
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 36745
BOLT-INSTRUMENTER: Number of function descriptors: 36745
BOLT-INSTRUMENTER: Number of branch counters: 354311
BOLT-INSTRUMENTER: Number of ST leaf node counters: 234393
BOLT-INSTRUMENTER: Number of direct call counters: 145977
BOLT-INSTRUMENTER: Total number of counters: 734681
BOLT-INSTRUMENTER: Total size of counters: 5877448 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 5139036 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 36112068 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/prof.fdata
BOLT-INFO: 0 out of 39961 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: the input contains 3793 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed
BOLT-INFO: UCE removed 1401 blocks and 19965 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.
 #0 0x000055e86565c5e1 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x000055e86565a1ae SignalHandler(int) Signals.cpp:0:0
 #2 0x00007ff69afcf870 __restore_rt sigaction.c:0:0
 #3 0x000055e865cf4246 llvm::RuntimeDyld::getSymbol(llvm::StringRef) const (~/build/bin/llvm-bolt+0x153d246)
 #4 0x000055e864a70141 (anonymous namespace)::BOLTSymbolResolver::lookup(std::set<llvm::StringRef, std::less<llvm::StringRef>, std::allocator<llvm::StringRef> > const&, llvm::unique_function<void (llvm::Expected<std::map<llvm::StringRef, llvm::JITEvaluatedSymbol, std::less<llvm::StringRef>, std::allocator<std::pair<llvm::StringRef const, llvm::JITEvaluatedSymbol> > > >)>) RewriteInstance.cpp:0:0
 #5 0x000055e865cf519f llvm::RuntimeDyldImpl::resolveExternalSymbols() (~/build/bin/llvm-bolt+0x153e19f)
 #6 0x000055e865cf947a llvm::RuntimeDyldImpl::resolveRelocations() (~/build/bin/llvm-bolt+0x154247a)
 #7 0x000055e865cf97ed llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() (~/build/bin/llvm-bolt+0x15427ed)
 #8 0x000055e864ec8990 llvm::bolt::InstrumentationRuntimeLibrary::link(llvm::bolt::BinaryContext&, llvm::StringRef, llvm::RuntimeDyld&, std::function<void (llvm::RuntimeDyld&)>) (~/build/bin/llvm-bolt+0x711990)
 #9 0x000055e864a9c031 llvm::bolt::RewriteInstance::emitAndLink() (~/build/bin/llvm-bolt+0x2e5031)
#10 0x000055e864a9fa23 llvm::bolt::RewriteInstance::run() (~/build/bin/llvm-bolt+0x2e8a23)
#11 0x000055e86490a661 main (~/build/bin/llvm-bolt+0x153661)
#12 0x00007ff69aa04b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#13 0x000055e86496773e _start (~/build/bin/llvm-bolt+0x1b073e)
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 0ae4e2d854c6d2435088cef682e967c59b15f628
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x200000, offset 0x200000
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-WARNING: Failed to analyze 63 relocations
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 1209
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 392
BOLT-INSTRUMENTER: Number of function descriptors: 392
BOLT-INSTRUMENTER: Number of branch counters: 4379
BOLT-INSTRUMENTER: Number of ST leaf node counters: 2664
BOLT-INSTRUMENTER: Number of direct call counters: 367
BOLT-INSTRUMENTER: Total number of counters: 7410
BOLT-INSTRUMENTER: Total size of counters: 59280 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 35672 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 399960 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/prof.fdata
BOLT-INFO: 0 out of 431 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: the input contains 66 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed
BOLT-INFO: UCE removed 4 blocks and 69 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.
#0 0x00005563bbd0b5e1 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x00005563bbd091ae SignalHandler(int) Signals.cpp:0:0
#2 0x00007fcd0d564870 __restore_rt sigaction.c:0:0
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
yavtuk commented 2 years ago

Hi @lnicola can you share the instrumentation options? I used the latest version (BOLT version: d6bb83533eb2a147b407d7afe96114a7c959ad6f) and the instrumentation works for the simple hello world application.

lnicola commented 2 years ago

No options:

~/build/bin/llvm-bolt target/release/hello -instrument -o hello.intrumented

Tested at 0ae4e2d854c6d2435088cef682e967c59b15f628.

I'll try to update and test again in the next few days.

lnicola commented 2 years ago

Still crashes for me:

$ ~/build/bin/llvm-bolt target/release/hello -instrument -o hello.intrumented
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 905321291962bb06aae54ab300f48ce506b62bef
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x200000, offset 0x200000
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-WARNING: Failed to analyze 92 relocations
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 1541
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 472
BOLT-INSTRUMENTER: Number of function descriptors: 472
BOLT-INSTRUMENTER: Number of branch counters: 4940
BOLT-INSTRUMENTER: Number of ST leaf node counters: 3015
BOLT-INSTRUMENTER: Number of direct call counters: 486
BOLT-INSTRUMENTER: Total number of counters: 8441
BOLT-INSTRUMENTER: Total size of counters: 67528 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 42253 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 461948 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/prof.fdata
BOLT-INFO: 0 out of 515 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: the input contains 74 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed
BOLT-INFO: UCE removed 19 blocks and 279 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.
#0 0x0000555a16fd25e1 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x0000555a16fd01ae SignalHandler(int) Signals.cpp:0:0
#2 0x00007f7a2123b870 __restore_rt sigaction.c:0:0

I've attached the executable I'm testing with.

hello.zip

yavtuk commented 2 years ago

Hi @lnicola, let’s go step by step try to synchronize our actions.

My environment:

uname -a Linux astlxd604b01 5.10.0-1013-oem #14-Ubuntu SMP Mon Feb 1 10:59:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

rustc --version rustc 1.53.0

cat hello.rs fn main() { println!("Hello World!"); }

rustc hello.rs -v -C link-args=-Wl,-q

BOLT build process: cd ${BOLT_DIR_PATH} mkdir build cd build cmake -G Ninja ${BOLT_DIR_PATH}/llvm -DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="clang;lld;bolt" ninja -j 0

${BOLT_BUILD_PATH}/bin/llvm-bolt --version LLVM (http://llvm.org/): LLVM version 14.0.0git Optimized build with assertions. Default target: x86_64-unknown-linux-gnu Host CPU: cascadelake

BOLT revision 05535cefaa6f57f2fb6f6b732dbb8208f6a7b6bb Registered Targets: x86 - 32-bit X86: Pentium-Pro and above x86-64 - 64-bit X86: EM64T and AMD64

Instrumentation process: cd ~/example/rust-hello/ ${BOLT_BUILD_PATH}/bin/llvm-bolt ./hello -instrument -o ./hello.instr -instrumentation-file=hello.fdata -instrumentation-sleep-time=2 -instrumentation-no-counters-clear

BOLT-INFO: shared object or position-independent executable detected BOLT-INFO: Target architecture: x86_64 BOLT-INFO: BOLT version: 05535cefaa6f57f2fb6f6b732dbb8208f6a7b6bb BOLT-INFO: first alloc address is 0x0 BOLT-INFO: creating new program header table at address 0x200000, offset 0x200000 BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it. BOLT-INFO: enabling relocation mode BOLT-INFO: forcing -jump-tables=move for instrumentation BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified BOLT-INFO: enabling lite mode BOLT-WARNING: Failed to analyze 46 relocations BOLT-INSTRUMENTER: Number of indirect call site descriptors: 1454 BOLT-INSTRUMENTER: Number of indirect call target descriptors: 501 BOLT-INSTRUMENTER: Number of function descriptors: 501 BOLT-INSTRUMENTER: Number of branch counters: 4743 BOLT-INSTRUMENTER: Number of ST leaf node counters: 3017 BOLT-INSTRUMENTER: Number of direct call counters: 537 BOLT-INSTRUMENTER: Total number of counters: 8297 BOLT-INSTRUMENTER: Total size of counters: 66376 bytes (static alloc memory) BOLT-INSTRUMENTER: Total size of string table emitted: 49405 bytes in file BOLT-INSTRUMENTER: Total size of descriptors: 449016 bytes in file BOLT-INSTRUMENTER: Profile will be saved to file hello.fdata BOLT-INFO: 0 out of 570 functions in the binary (0.0%) have non-empty execution profile BOLT-INFO: the input contains 59 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed BOLT-INFO: removed 1 'repz' prefixes with estimated execution count of 0 times. BOLT-INFO: UCE removed 3 blocks and 126 bytes of code. BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0. BOLT-INFO: output linked against instrumentation runtime library, lib entry point is 0x4b45e0 BOLT-INFO: clear procedure is 0x4b33e0 BOLT-INFO: setting _end to 0x496728 BOLT-INFO: patched build-id (flipped last bit)

~/examples/rust-hello$./hello.instr Hello World! ~/examples/rust-hello$ ls hello hello.fdata hello.instr hello.rs

Also I checked your binary file and it is processed correctly. Are you doing the same way?

lnicola commented 2 years ago

Hmm, so the kernel and rustc version shouldn't matter, since you're testing with my binary, but I'm on 5.15.2 with rustc 1.57.0-beta.2 and I'm using cargo, not rustc to compile the test file.

I built BOLT using:

$ cmake -G Ninja ../llvm-bolt/llvm -DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_ENABLE_PROJECTS="clang;lld;bolt"
$ cmake --build . --parallel 32

The big difference seems to be the LLVM assertions options, but it seems unlikely that disabling them would prevent a crash. I also tested that -instrumentation-file=hello.fdata -instrumentation-sleep-time=2 -instrumentation-no-counters-clear have no effect.

$ ~/build/bin/llvm-bolt --version
LLVM (http://llvm.org/):
  LLVM version 13.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver3

BOLT revision 905321291962bb06aae54ab300f48ce506b62bef
  Registered Targets:
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

Why do we have different LLVM versions? Did CMake pick my system LLVM, perhaps?

yavtuk commented 2 years ago

it's strange that your BOLT revision is 905321291962bb06aae54ab300f48ce506b62bef, can you check the build from the last commit?

lnicola commented 2 years ago

Oh, sorry, that was one documentation commit of mine applied on top of 0ae4e2d854c6d2435088cef682e967c59b15f628 (which is also gone from the history?). I fetched the latest code, but didn't check out the branch.

yavtuk commented 2 years ago

Are local tests passed successful? ninja check-bolt -j 8

lnicola commented 2 years ago

Still fails on a 05535cefaa6f57f2fb6f6b732dbb8208f6a7b6bb. I also tried make check-bolt:

[100%] Running the BOLT regression tests
llvm-lit: /home/grayshade/llvm-bolt/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /home/grayshade/build/bin/clang
llvm-lit: /home/grayshade/llvm-bolt/llvm/utils/lit/lit/llvm/config.py:436: note: using ld.lld: /home/grayshade/build/bin/ld.lld
llvm-lit: /home/grayshade/llvm-bolt/llvm/utils/lit/lit/llvm/config.py:436: note: using lld-link: /home/grayshade/build/bin/lld-link
llvm-lit: /home/grayshade/llvm-bolt/llvm/utils/lit/lit/llvm/config.py:436: note: using ld64.lld: /home/grayshade/build/bin/ld64.lld
llvm-lit: /home/grayshade/llvm-bolt/llvm/utils/lit/lit/llvm/config.py:436: note: using wasm-ld: /home/grayshade/build/bin/wasm-ld
FAIL: BOLT :: X86/asm-dump.c (92 of 105)
******************** TEST 'BOLT :: X86/asm-dump.c' FAILED ********************
Script:
--
: 'RUN: at line 7';   /home/grayshade/build/bin/clang -fPIC /home/grayshade/llvm-bolt/bolt/test/X86/asm-dump.c -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe -Wl,-q
: 'RUN: at line 10';   /home/grayshade/build/bin/llvm-bolt /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe -instrument -instrumentation-file=/home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.instr
: 'RUN: at line 13';   /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.instr > /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.result
: 'RUN: at line 16';   /home/grayshade/build/bin/llvm-bolt /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe -p /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata -funcs=main -asm-dump=/home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp -o /dev/null    | /home/grayshade/build/bin/FileCheck /home/grayshade/llvm-bolt/bolt/test/X86/asm-dump.c --check-prefix=CHECK-BOLT
: 'RUN: at line 20';   cat /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp/main.s | /home/grayshade/build/bin/FileCheck /home/grayshade/llvm-bolt/bolt/test/X86/asm-dump.c --check-prefix=CHECK-FILE
: 'RUN: at line 24';   sed -i 's/\.L/L/g' /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp/main.s
: 'RUN: at line 27';   /home/grayshade/build/bin/llvm-mc -filetype=obj -triple x86_64-unknown-unknown /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp/main.s -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.o
: 'RUN: at line 30';   /home/grayshade/llvm-bolt/bolt/test/link_fdata.py /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp/main.s /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata.reconst
: 'RUN: at line 38';   /home/grayshade/build/bin/llvm-strip --strip-unneeded /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.o
: 'RUN: at line 41';   /home/grayshade/build/bin/clang -fPIC /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.o -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe.reopt -Wl,-q
: 'RUN: at line 44';   /home/grayshade/build/bin/llvm-bolt /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe.reopt -p /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata.reconst -o /dev/null    | /home/grayshade/build/bin/FileCheck /home/grayshade/llvm-bolt/bolt/test/X86/asm-dump.c --check-prefix=CHECK-REOPT
--
Exit Code: 139

Command Output (stdout):
--
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 05535cefaa6f57f2fb6f6b732dbb8208f6a7b6bb
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: creating new program header table at address 0x600000, offset 0x200000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 3
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 9
BOLT-INSTRUMENTER: Number of function descriptors: 9
BOLT-INSTRUMENTER: Number of branch counters: 7
BOLT-INSTRUMENTER: Number of ST leaf node counters: 13
BOLT-INSTRUMENTER: Number of direct call counters: 0
BOLT-INSTRUMENTER: Total number of counters: 20
BOLT-INSTRUMENTER: Total size of counters: 160 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 183 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 1396 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata
BOLT-INFO: 0 out of 12 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: UCE removed 0 blocks and 0 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.

--
Command Output (stderr):
--
 #0 0x000055b3caf01191 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x000055b3caefed7e SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fd7a04c6870 __restore_rt sigaction.c:0:0
 #3 0x000055b3cb030126 llvm::RuntimeDyld::getSymbol(llvm::StringRef) const (/home/grayshade/build/bin/llvm-bolt+0xcc0126)
 #4 0x000055b3cad9de07 (anonymous namespace)::BOLTSymbolResolver::lookup(std::set<llvm::StringRef, std::less<llvm::StringRef>, std::allocator<llvm::StringRef> > const&, llvm::unique_function<void (llvm::Expected<std::map<llvm::StringRef, llvm::JITEvaluatedSymbol, std::less<llvm::StringRef>, std::allocator<std::pair<llvm::StringRef const, llvm::JITEvaluatedSymbol> > > >)>) RewriteInstance.cpp:0:0
 #5 0x000055b3cb02d28d llvm::RuntimeDyldImpl::resolveExternalSymbols() (/home/grayshade/build/bin/llvm-bolt+0xcbd28d)
 #6 0x000055b3cb02f5a8 llvm::RuntimeDyldImpl::resolveRelocations() (/home/grayshade/build/bin/llvm-bolt+0xcbf5a8)
 #7 0x000055b3cb02fd9d llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() (/home/grayshade/build/bin/llvm-bolt+0xcbfd9d)
 #8 0x000055b3caf135a0 llvm::bolt::InstrumentationRuntimeLibrary::link(llvm::bolt::BinaryContext&, llvm::StringRef, llvm::RuntimeDyld&, std::function<void (llvm::RuntimeDyld&)>) (/home/grayshade/build/bin/llvm-bolt+0xba35a0)
 #9 0x000055b3cadcf05c llvm::bolt::RewriteInstance::emitAndLink() (/home/grayshade/build/bin/llvm-bolt+0xa5f05c)
#10 0x000055b3cadd3073 llvm::bolt::RewriteInstance::run() (/home/grayshade/build/bin/llvm-bolt+0xa63073)
#11 0x000055b3ca649610 main (/home/grayshade/build/bin/llvm-bolt+0x2d9610)
#12 0x00007fd79fef8b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#13 0x000055b3ca6b32ee _start (/home/grayshade/build/bin/llvm-bolt+0x3432ee)
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.  Program arguments: /home/grayshade/build/bin/llvm-bolt /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe -instrument -instrumentation-file=/home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.instr
/home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.script: line 11: 116804 Segmentation fault      (core dumped) /home/grayshade/build/bin/llvm-bolt /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.exe -instrument -instrumentation-file=/home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.fdata -o /home/grayshade/build/tools/bolt/test/X86/Output/asm-dump.c.tmp.instr

--

********************
FAIL: BOLT :: X86/internal-call-instrument.s (93 of 105)
******************** TEST 'BOLT :: X86/internal-call-instrument.s' FAILED ********************
Script:
--
: 'RUN: at line 5';   /home/grayshade/build/bin/llvm-mc -filetype=obj -triple x86_64-unknown-unknown /home/grayshade/llvm-bolt/bolt/test/X86/internal-call-instrument.s -o /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.o
: 'RUN: at line 7';   /home/grayshade/build/bin/llvm-strip --strip-unneeded /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.o
: 'RUN: at line 8';   /home/grayshade/build/bin/clang /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.o -o /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.exe -Wl,-q
: 'RUN: at line 10';   /home/grayshade/build/bin/llvm-bolt -instrument /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.exe -relocs -o /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.out
--
Exit Code: 139

Command Output (stdout):
--
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 05535cefaa6f57f2fb6f6b732dbb8208f6a7b6bb
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: creating new program header table at address 0x600000, offset 0x200000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 3
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 9
BOLT-INSTRUMENTER: Number of function descriptors: 9
BOLT-INSTRUMENTER: Number of branch counters: 5
BOLT-INSTRUMENTER: Number of ST leaf node counters: 12
BOLT-INSTRUMENTER: Number of direct call counters: 0
BOLT-INSTRUMENTER: Total number of counters: 17
BOLT-INSTRUMENTER: Total size of counters: 136 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 162 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 1096 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/prof.fdata
BOLT-INFO: 0 out of 11 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: UCE removed 0 blocks and 0 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.

--
Command Output (stderr):
--
BOLT-WARNING: will skip the following function(s) as unsupported internal calls were detected:
              main
 #0 0x000055b753943191 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x000055b753940d7e SignalHandler(int) Signals.cpp:0:0
 #2 0x00007f65bb074870 __restore_rt sigaction.c:0:0
 #3 0x000055b753a72126 llvm::RuntimeDyld::getSymbol(llvm::StringRef) const (/home/grayshade/build/bin/llvm-bolt+0xcc0126)
 #4 0x000055b7537dfe07 (anonymous namespace)::BOLTSymbolResolver::lookup(std::set<llvm::StringRef, std::less<llvm::StringRef>, std::allocator<llvm::StringRef> > const&, llvm::unique_function<void (llvm::Expected<std::map<llvm::StringRef, llvm::JITEvaluatedSymbol, std::less<llvm::StringRef>, std::allocator<std::pair<llvm::StringRef const, llvm::JITEvaluatedSymbol> > > >)>) RewriteInstance.cpp:0:0
 #5 0x000055b753a6f28d llvm::RuntimeDyldImpl::resolveExternalSymbols() (/home/grayshade/build/bin/llvm-bolt+0xcbd28d)
 #6 0x000055b753a715a8 llvm::RuntimeDyldImpl::resolveRelocations() (/home/grayshade/build/bin/llvm-bolt+0xcbf5a8)
 #7 0x000055b753a71d9d llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() (/home/grayshade/build/bin/llvm-bolt+0xcbfd9d)
 #8 0x000055b7539555a0 llvm::bolt::InstrumentationRuntimeLibrary::link(llvm::bolt::BinaryContext&, llvm::StringRef, llvm::RuntimeDyld&, std::function<void (llvm::RuntimeDyld&)>) (/home/grayshade/build/bin/llvm-bolt+0xba35a0)
 #9 0x000055b75381105c llvm::bolt::RewriteInstance::emitAndLink() (/home/grayshade/build/bin/llvm-bolt+0xa5f05c)
#10 0x000055b753815073 llvm::bolt::RewriteInstance::run() (/home/grayshade/build/bin/llvm-bolt+0xa63073)
#11 0x000055b75308b610 main (/home/grayshade/build/bin/llvm-bolt+0x2d9610)
#12 0x00007f65baaa6b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#13 0x000055b7530f52ee _start (/home/grayshade/build/bin/llvm-bolt+0x3432ee)
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.  Program arguments: /home/grayshade/build/bin/llvm-bolt -instrument /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.exe -relocs -o /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.out
/home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.script: line 4: 116758 Segmentation fault      (core dumped) /home/grayshade/build/bin/llvm-bolt -instrument /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.exe -relocs -o /home/grayshade/build/tools/bolt/test/X86/Output/internal-call-instrument.s.tmp.out

--

(and so on)

********************
********************
Failed Tests (13):
  BOLT :: X86/asm-dump.c
  BOLT :: X86/internal-call-instrument.s
  BOLT :: runtime/X86/basic-instrumentation.test
  BOLT :: runtime/X86/fdata-escape-chars.ll
  BOLT :: runtime/X86/instrumentation-dup-jts.s
  BOLT :: runtime/X86/instrumentation-ind-calls.s
  BOLT :: runtime/X86/instrumentation-indirect.c
  BOLT :: runtime/X86/instrumentation-pie.c
  BOLT :: runtime/X86/instrumentation-shlib.c
  BOLT :: runtime/X86/internal-call-instrument.test
  BOLT :: runtime/X86/user-func-reorder.c
  BOLT :: runtime/exceptions-instrumentation.test
  BOLT :: runtime/meta-merge-fdata.test

Testing Time: 6.65s
  Unsupported:  5
  Passed     : 87
  Failed     : 13
yavtuk commented 2 years ago

Do you run instrumentation from command line directly or you use *.sh script? myabe it sounds like a magic but I faced with the same issue somehow before when I used Debug build and bash. I couldn't reproduce it using release build.

lnicola commented 2 years ago

From the command line, but if the unit tests fail, there might be a problem on my system.

yavtuk commented 2 years ago

can you check that bolt/CMakeLists.txt contains Release value as well?

-DCMAKE_BUILD_TYPE=Release

lnicola commented 2 years ago

It does. I also used Release when building it.

yota9 commented 2 years ago

@lnicola Could you try to use "--conservative-instrumentation" flag?

lnicola commented 2 years ago

Still crashes:

BOLT-INSTRUMENTER: Number of indirect call site descriptors: 1541
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 472
BOLT-INSTRUMENTER: Number of function descriptors: 472
BOLT-INSTRUMENTER: Number of branch counters: 13449
BOLT-INSTRUMENTER: Number of ST leaf node counters: 0
BOLT-INSTRUMENTER: Number of direct call counters: 821
BOLT-INSTRUMENTER: Total number of counters: 14270
BOLT-INSTRUMENTER: Total size of counters: 114160 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 42253 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 437828 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /tmp/prof.fdata
BOLT-INFO: 0 out of 515 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: the input contains 74 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed
BOLT-INFO: UCE removed 19 blocks and 38 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.

Thread 1 "llvm-bolt" received signal SIGSEGV, Segmentation fault.
0x0000555556214126 in llvm::RuntimeDyld::getSymbol(llvm::StringRef) const ()
(gdb) bt
#0  0x0000555556214126 in llvm::RuntimeDyld::getSymbol(llvm::StringRef) const ()
#1  0x0000555555f81e07 in (anonymous namespace)::BOLTSymbolResolver::lookup(std::set<llvm::StringRef, std::less<llvm::StringRef>, std::allocator<llvm::StringRef> > const&, llvm::unique_function<void (llvm::Expected<std::map<llvm::StringRef, llvm::JITEvaluatedSymbol, std::less<llvm::StringRef>, std::allocator<std::pair<llvm::StringRef const, llvm::JITEvaluatedSymbol> > > >)>) ()
#2  0x000055555621128d in llvm::RuntimeDyldImpl::resolveExternalSymbols() ()
#3  0x00005555562135a8 in llvm::RuntimeDyldImpl::resolveRelocations() ()
#4  0x0000555556213d9d in llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() ()
#5  0x00005555560f75a0 in llvm::bolt::InstrumentationRuntimeLibrary::link(llvm::bolt::BinaryContext&, llvm::StringRef, llvm::RuntimeDyld&, std::function<void (llvm::RuntimeDyld&)>) ()
#6  0x0000555555fb305c in llvm::bolt::RewriteInstance::emitAndLink() ()
#7  0x0000555555fb7073 in llvm::bolt::RewriteInstance::run() ()
#8  0x000055555582d610 in main ()
yota9 commented 2 years ago

@lnicola I had a crash like this one time. But I don't remember what was the problem exactly. Probably I've forgot to add the symbol to the instrumentation library or something like this. There is lookup function in RewriteInstance with the code:

LLVM_DEBUG(dbgs() << "BOLT: looking for " << SymName << "\n");
JITEvaluatedSymbol Result = RTDyld.getSymbol(SymName);

Could you please remove LLVM_DEBUG so we could see the symbol name it is crashing on?

lnicola commented 2 years ago
BOLT: looking for __TMC_LIST__/1
BOLT: looking for __bolt_instr_tables
BOLT: looking for __do_global_dtors_aux_fini_array_entry/1
BOLT: looking for __dso_handle
BOLT: looking for _rust_extern_with_linkage___dso_handle/1
BOLT: looking for anon.c37c96173e096a0fdef6297fb8f2f494.0.llvm.13149066270489800442
BOLT: looking for str.4/1
BOLT: looking for str.5/1
BOLT: looking for __stack_chk_fail
 #0 0x000055640fb54491 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x000055640fb5207e SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fc89cc3e870 __restore_rt sigaction.c:0:0
 #3 0x000055640fc83426 llvm::RuntimeDyld::getSymbol(llvm::StringRef) const (/home/grayshade/build/bin/llvm-bolt+0xcc0426)
 #4 0x000055640f9f1113 (anonymous namespace)::BOLTSymbolResolver::lookup(std::set<llvm::StringRef, std::less<llvm::StringRef>, std::allocator<llvm::StringRef> > const&, llvm::unique_function<void (llvm::Expected<std::map<llvm::StringRef, llvm::JITEvaluatedSymbol, std::less<llvm::StringRef>, std::allocator<std::pair<llvm::StringRef const, llvm::JITEvaluatedSymbol> > > >)>) RewriteInstance.cpp:0:0
 #5 0x000055640fc8058d llvm::RuntimeDyldImpl::resolveExternalSymbols() (/home/grayshade/build/bin/llvm-bolt+0xcbd58d)
 #6 0x000055640fc828a8 llvm::RuntimeDyldImpl::resolveRelocations() (/home/grayshade/build/bin/llvm-bolt+0xcbf8a8)
 #7 0x000055640fc8309d llvm::RuntimeDyld::finalizeWithMemoryManagerLocking() (/home/grayshade/build/bin/llvm-bolt+0xcc009d)
 #8 0x000055640fb668a0 llvm::bolt::InstrumentationRuntimeLibrary::link(llvm::bolt::BinaryContext&, llvm::StringRef, llvm::RuntimeDyld&, std::function<void (llvm::RuntimeDyld&)>) (/home/grayshade/build/bin/llvm-bolt+0xba38a0)
 #9 0x000055640fa2235c llvm::bolt::RewriteInstance::emitAndLink() (/home/grayshade/build/bin/llvm-bolt+0xa5f35c)
#10 0x000055640fa26373 llvm::bolt::RewriteInstance::run() (/home/grayshade/build/bin/llvm-bolt+0xa63373)
#11 0x000055640f29c610 main (/home/grayshade/build/bin/llvm-bolt+0x2d9610)
#12 0x00007fc89c670b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#13 0x000055640f3062ee _start (/home/grayshade/build/bin/llvm-bolt+0x3432ee)
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
yota9 commented 2 years ago

@lnicola Could you please try to add -fno-stack-protector flag to the bolt/runtime/CMakeLists.txt in target_compile_options(bolt_rt_instr ? Don't forget to rebuild the instrumentation library after this.

lnicola commented 2 years ago

That worked, thanks!

yota9 commented 2 years ago

@lnicola Great, I will prepare the patch

yota9 commented 2 years ago

That time I would like to ask for the FB team @maksfb @rafaelauler about such problem in llvm::RuntimeDyld::getSymbol. I don't think it is expected behavior to segfault in case we don't find the symbol. I didn't investigate this problem yet, but maybe you already know where the problem might occur :)

rafaelauler commented 2 years ago

Thanks for pointing that out, @yota9. I was doing something silly. I will put up the fix soon.

Originally, we would try to use LLVM's linker again to resolve a missing symbol (it was a circular dependency, the linker failed to locate something, it called our callback to resolve it, and we would call the linker again asking it to look it up in its loaded objects, one of which was the bolt-produced object file).

That was because of an older ORC-based design, which forced us to query ORC for the symbols exported from the bolt-produced object. But I refactored that and removed ORC, so we use RTDyld directly, which will already resolve all symbols correctly without the need for our resolver to kick in. So this function essentially never runs, unless something fishy is happening (the symbol was not found). When it runs, it was crashing because it was using a nullptr reference to RTDyld.

lnicola commented 2 years ago

Fixed in https://github.com/facebookincubator/BOLT/issues/217#issuecomment-970089077