Closed Ravenslofty closed 2 years ago
I used version 0.1.0 of cargo-pgo, downloaded yukari (what commit SHA should I use?), set rustc to 2befdefdd and compiled BOLT 6bb51bf06214af3690af7034f4edeb265732c481. Then I tried the same command, but it generated the instrumented binary correctly :confused: So I'm also not sure what's going on.
Could you try to install the following version of the plugin:
$ cargo install --git https://github.com/kobzol/cargo-pgo --branch bolt-logging
And then run cargo pgo bolt build with debug logging?
$ RUST_LOG=cargo_pgo=debug cargo pgo bolt build
$ RUST_LOG=cargo_pgo=debug cargo pgo bolt build
[2022-08-08T15:09:01Z INFO cargo_pgo::bolt::instrument] BOLT profile directory will be cleared.
[2022-08-08T15:09:01Z INFO cargo_pgo::bolt::instrument] BOLT profiles will be stored into /home/lofty/yukari/target/bolt-profiles.
[2022-08-08T15:09:01Z DEBUG cargo_pgo] Running command "rustc" "-vV"
[2022-08-08T15:09:01Z DEBUG cargo_pgo::build] Executing cargo command: "cargo" "build" "--release" "--message-format" "json-diagnostic-rendered-ansi" "--target" "x86_64-unknown-linux-gnu"
[2022-08-08T15:09:01Z INFO cargo_pgo::bolt::instrument] Binary yukari built successfully. It will be now instrumented with BOLT.
[2022-08-08T15:09:01Z DEBUG cargo_pgo] Running command "/home/lofty/llvm-project/llvm-install/bin/llvm-bolt" "-instrument" "/home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari" "--instrumentation-file-append-pid" "--instrumentation-file" "/home/lofty/yukari/target/bolt-profiles/yukari/profile" "-update-debug-sections" "-o" "/home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented"
[2022-08-08T15:09:01Z DEBUG cargo_pgo::bolt::instrument] BOLT instrumentation stdout
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 6bb51bf06214af3690af7034f4edeb265732c481
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x200000, offset 0x200000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling -align-macro-fusion=all since no profile was specified
BOLT-INFO: enabling lite mode
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 794
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 450
BOLT-INSTRUMENTER: Number of function descriptors: 450
BOLT-INSTRUMENTER: Number of branch counters: 7191
BOLT-INSTRUMENTER: Number of ST leaf node counters: 4013
BOLT-INSTRUMENTER: Number of direct call counters: 870
BOLT-INSTRUMENTER: Total number of counters: 12074
BOLT-INSTRUMENTER: Total size of counters: 96592 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 42289 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 649540 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /home/lofty/yukari/target/bolt-profiles/yukari/profile
BOLT-INFO: 0 out of 496 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: the input contains 99 (dynamic count : 0) opportunities for macro-fusion optimization that are going to be fixed
BOLT-INFO: 4912 instructions were shortened
BOLT-INFO: removed 41 empty blocks
BOLT-INFO: UCE removed 309 blocks and 18719 bytes of code.
BOLT-INFO: SCTC: patched 0 tail calls (0 forward) tail calls (0 backward) from a total of 0 while removing 0 double jumps and removing 0 basic blocks totalling 0 bytes of code. CTCs total execution count is 0 and the number of times CTCs are taken is 0.
BOLT-INFO: output linked against instrumentation runtime library, lib entry point is 0x4d7dc0
BOLT-INFO: clear procedure is 0x4d6c10
[2022-08-08T15:09:01Z DEBUG cargo_pgo::bolt::instrument] BOLT instrumentation stderr
BOLT-ERROR: skeleton CU at 0x54b does not have DW_AT_GNU_ranges_base or DW_AT_low_pc to convert to update ranges base
BOLT-ERROR: skeleton CU at 0x5b870 does not have DW_AT_GNU_ranges_base or DW_AT_low_pc to convert to update ranges base
#0 0x0000565278cd6b64 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
#1 0x0000565278cd461b SignalHandler(int) Signals.cpp:0:0
#2 0x00007f476f783a40 (/usr/lib/libc.so.6+0x38a40)
#3 0x0000565279d5bc5e llvm::bolt::DebugAddrWriterDwarf5::getOffset(llvm::DWARFUnit&) (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x2f5ec5e)
#4 0x0000565278d85800 llvm::bolt::DWARFRewriter::finalizeDebugSections(llvm::bolt::DebugInfoBinaryPatcher&) (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x1f88800)
#5 0x0000565278d92e70 llvm::bolt::DWARFRewriter::updateDebugInfo() (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x1f95e70)
#6 0x0000565278d560a8 llvm::bolt::RewriteInstance::updateMetadata() (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x1f590a8)
#7 0x0000565278d7a440 llvm::bolt::RewriteInstance::run() (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x1f7d440)
#8 0x000056527741a1d4 main (/home/lofty/llvm-project/llvm-install/bin/llvm-bolt+0x61d1d4)
#9 0x00007f476f76e2d0 (/usr/lib/libc.so.6+0x232d0)
#10 0x00007f476f76e38a __libc_start_main (/usr/lib/libc.so.6+0x2338a)
#11 0x00005652774f5255 _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:117:0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/lofty/llvm-project/llvm-install/bin/llvm-bolt -instrument /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari --instrumentation-file-append-pid --instrumentation-file /home/lofty/yukari/target/bolt-profiles/yukari/profile -update-debug-sections -o /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented
[2022-08-08T15:09:01Z INFO cargo_pgo::bolt::instrument] Binary yukari instrumented successfully. Now run /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented on your workload.
[2022-08-08T15:09:01Z INFO cargo_pgo::bolt::instrument] BOLT instrumentation build finished successfully.
I suppose we have our answer here; BOLT is crashing but cargo-pgo reports success. While I suppose this means it's not a cargo-pgo bug, maybe this would be something to detect (I don't think this counts as a successful instrumentation).
As for the Yukari SHA1: 1dc84868a5ea258aee1f622a61ade511a135ffce.
Thanks for testing it! Indeed, you are right. BOLT chokes on the binary for some reason (seems to be some kind of debug info/DWARF issue), but there's also a bug in cargo-pgo, because the error wasn't reported. It should be fixed by https://github.com/Kobzol/cargo-pgo/pull/7.
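To illustrate the kind of check that was missing, here is a minimal sketch of running an external tool and treating a non-zero exit code or a signal-induced crash as an error. The function name run_checked is hypothetical and this is not the code from the linked pull request; it only uses std::process from the Rust standard library.

```rust
use std::process::{Command, Output};

/// Hypothetical helper (not cargo-pgo's actual code): runs `program` with `args`
/// and returns its output only if the process exited with status 0.
/// A non-zero exit code or termination by a signal (e.g. a crash) becomes an error.
fn run_checked(program: &str, args: &[&str]) -> Result<Output, String> {
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|e| format!("failed to start {}: {}", program, e))?;

    if output.status.success() {
        return Ok(output);
    }

    // On Unix, `code()` is `None` when the process was killed by a signal.
    let reason = match output.status.code() {
        Some(code) => format!("exit code {}", code),
        None => "terminated by a signal".to_string(),
    };
    Err(format!(
        "{} failed ({}):\n{}",
        program,
        reason,
        String::from_utf8_lossy(&output.stderr)
    ))
}
```

With a check like this, a crashing llvm-bolt invocation would surface as an Err carrying the captured stderr instead of being followed by a "instrumented successfully" message.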
I think I've tracked this down to llvm/llvm-project#56277, and removing -update-debug-sections from the llvm-bolt command line appears to work around the issue. Would temporarily disabling the flag be an option?
Nice find! I don't want to outright disable it, but I think that I could allow parametrization of the BOLT flags, e.g. like this:
$ cargo pgo bolt build --bolt-args ""
(empty args, no debug update)
$ cargo pgo bolt optimize --bolt-args "--icf=1 -dyno-stats"
I'll try to implement it soon.
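One way such a parametrization could behave is sketched below. This is an assumption about the semantics, not the actual cargo-pgo implementation: the function instrument_args and the constant DEFAULT_EXTRA_ARGS are hypothetical names, and the fixed flags and their order are copied from the llvm-bolt command shown in the debug log above. Passing no --bolt-args keeps the default extra flag, while an empty string drops it.

```rust
/// Assumed default extra flag, taken from the command line in the log above.
const DEFAULT_EXTRA_ARGS: &[&str] = &["-update-debug-sections"];

/// Hypothetical sketch: builds the argument list for an instrumentation run of llvm-bolt.
/// `user_args` models the value of `--bolt-args`:
/// `None` keeps the defaults, `Some("")` yields no extra flags,
/// and any other string is split on whitespace and used instead of the defaults.
fn instrument_args(
    input: &str,
    output: &str,
    profile: &str,
    user_args: Option<&str>,
) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "-instrument".into(),
        input.into(),
        "--instrumentation-file-append-pid".into(),
        "--instrumentation-file".into(),
        profile.into(),
    ];
    match user_args {
        Some(extra) => args.extend(extra.split_whitespace().map(String::from)),
        None => args.extend(DEFAULT_EXTRA_ARGS.iter().map(|s| s.to_string())),
    }
    args.push("-o".into());
    args.push(output.into());
    args
}
```

Splitting on whitespace is the simplest choice here; it would not handle flag values that themselves contain spaces.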
I implemented it in https://github.com/Kobzol/cargo-pgo/pull/10. Could you please try this version:
$ cargo install --git https://github.com/kobzol/cargo-pgo --branch bolt-args
And then instrument the binary with cargo pgo bolt build --bolt-args "" to check whether it works?
Yes, I can confirm both cargo pgo bolt build --bolt-args "" and cargo pgo bolt optimize --bolt-args "" work, for a whopping 0.62% speedup over PGO (...probably noise).
Still, I learned something from all this, so thank you all the same.
While BOLT might not always work as advertised, and you need to gather a lot of data before it can do anything (at least a few billion instructions I would say), running optimize with empty flags will not do that much I think :) Try this:
$ cargo pgo bolt optimize --bolt-args "-reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions=2 -split-all-cold -jump-tables=move -use-gnu-stack -split-eh -lite=1 -icf=1 -relocs -dyno-stats"
This is basically the list of flags used by cargo-pgo, but without -update-debug-sections.
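If you would rather derive that string than type it by hand, a small sketch like the following filters one flag out of a list. The list THREAD_FLAGS is copied from the command quoted in this thread, with -update-debug-sections appended at the end; its position is an assumption, and the list has not been checked against the cargo-pgo source. The helper names are hypothetical.

```rust
/// Flags quoted in this thread, plus the flag that was left out there.
/// The position of "-update-debug-sections" in the list is an assumption.
const THREAD_FLAGS: &[&str] = &[
    "-reorder-blocks=ext-tsp",
    "-reorder-functions=hfsort",
    "-split-functions=2",
    "-split-all-cold",
    "-jump-tables=move",
    "-use-gnu-stack",
    "-split-eh",
    "-lite=1",
    "-icf=1",
    "-relocs",
    "-dyno-stats",
    "-update-debug-sections",
];

/// Returns `flags` with every exact occurrence of `unwanted` removed, preserving order.
fn without_flag<'a>(flags: &[&'a str], unwanted: &str) -> Vec<&'a str> {
    flags.iter().copied().filter(|f| *f != unwanted).collect()
}

/// Builds a space-separated string suitable for passing as the `--bolt-args` value.
fn bolt_args_string() -> String {
    without_flag(THREAD_FLAGS, "-update-debug-sections").join(" ")
}
```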
This was while I was working on https://github.com/yukarichess/yukari.
I wish I could provide more to work with, but I don't know where to begin.