facebookarchive / BOLT

Binary Optimization and Layout Tool - A Linux command-line utility used for optimizing performance of binaries

perf2bolt: crashes with assertion. #292

Open LeiW000 opened 2 years ago

LeiW000 commented 2 years ago

I tried to optimize a PGO-built php-fpm binary with BOLT. When I ran perf2bolt, it crashed with the following assertion failure.

perf2bolt -p perf.data -o /workspace/perf.fdata /opt/pkb/git/hhvm-perf/php-fpm
.......
BOLT-INFO: shared object or position-independent executable detected
PERF2BOLT: Starting data aggregation job for perf.data
PERF2BOLT: spawning perf job to read branch events
PERF2BOLT: spawning perf job to read mem events
PERF2BOLT: spawning perf job to read process events
PERF2BOLT: spawning perf job to read task events
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: c62053979489ccb002efe411c3af059addcb5d7d
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1400000, offset 0x1400000
BOLT-INFO: enabling relocation mode
BOLT-INFO: enabling strict relocation mode for aggregation purposes
BOLT-WARNING: split function detected on input : OnUpdate_date_timezone.cold.13/1. The support is limited in relocation mode.
BOLT-INFO: pre-processing profile using perf data aggregator
BOLT-INFO: binary build-id is:     802999cf701009d2cc369782da77829e749a8ff3
PERF2BOLT: spawning perf job to read buildid list
PERF2BOLT: matched build-id and file name
PERF2BOLT: waiting for perf mmap events collection to finish...
PERF2BOLT: parsing perf-script mmap events output
PERF2BOLT: waiting for perf task events collection to finish...
PERF2BOLT: parsing perf-script task events output
PERF2BOLT: input binary is associated with 73 PID(s)
PERF2BOLT: waiting for perf events collection to finish...
PERF2BOLT: parse branch events...
PERF2BOLT: read 15 samples and 359 LBR entries
PERF2BOLT: 7860 samples (99.8%) were ignored
PERF2BOLT-WARNING: less than 50% of all recorded samples were attributed to the input binary
PERF2BOLT: traces mismatching disassembled function contents: 0 (0.0%)
PERF2BOLT: out of range traces involving unknown regions: 84 (24.3%)
BOLT-WARNING: interprocedural reference between unrelated fragments: zend_fetch_this_var/1(*2) and ZEND_ASSIGN_SPEC_VAR_TMP_RETVAL_UNUSED_HANDLER.cold.91/1(*2)
BOLT-WARNING: interprocedural reference between unrelated fragments: zend_ast_with_attributes/1 and zend_object_std_init.cold.0/1(*2)
BOLT-WARNING: interprocedural reference between unrelated fragments: zend_compile_assign/1 and zend_ensure_writable_variable.cold.34/1(*2)
perf2bolt: /home/bolt/llvm-project/bolt/lib/Core/BinaryContext.cpp:683: void llvm::bolt::BinaryContext::populateJumpTables(): Assertion `0 && "unclaimed PC-relative relocations left in data\n"' failed.
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
perf2bolt(+0x9207e4)[0x555555e747e4]
perf2bolt(+0x91e35e)[0x555555e7235e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x143c0)[0x7ffff7fb53c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7ffff7aa003b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7ffff7a7f859]
/lib/x86_64-linux-gnu/libc.so.6(+0x22729)[0x7ffff7a7f729]
/lib/x86_64-linux-gnu/libc.so.6(+0x34006)[0x7ffff7a91006]
perf2bolt(+0x1704d61)[0x555556c58d61]
perf2bolt(+0x6d96c9)[0x555555c2d6c9]
perf2bolt(+0x743bae)[0x555555c97bae]
perf2bolt(+0x1ed5fa)[0x5555557415fa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7ffff7a810b3]
perf2bolt(+0x24c37e)[0x5555557a037e]
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
pkb@53cb3a5fdd24:/opt/pkb/git/hhvm-perf$ readelf -S php-fpm
There are 51 section headers, starting at offset 0x54aae78:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         00000000000002a8  000002a8
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.build-i NOTE             00000000000002c4  000002c4
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .note.ABI-tag     NOTE             00000000000002e8  000002e8
       0000000000000020  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000000308  00000308
       0000000000004a18  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000004d20  00004d20
       0000000000010950  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           0000000000015670  00015670
       000000000000e289  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           00000000000238fa  000238fa
       000000000000161c  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000024f18  00024f18
       00000000000002a0  0000000000000000   A       6     6     8
  [ 9] .rela.dyn         RELA             00000000000251b8  000251b8
       00000000000c93d8  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             00000000000ee590  000ee590
       0000000000003990  0000000000000018  AI       5    30     8
  [11] .init             PROGBITS         0000000000200000  00200000
       000000000000001b  0000000000000000  AX       0     0     4
  [12] .rela.init        RELA             0000000000000000  029fa7b8
       0000000000000018  0000000000000018   I      48    11     8
  [13] .plt              PROGBITS         0000000000200020  00200020
       0000000000002670  0000000000000010  AX       0     0     16
  [14] .plt.got          PROGBITS         0000000000202690  00202690
       0000000000000040  0000000000000008  AX       0     0     8
  [15] .text             PROGBITS         00000000002026d0  002026d0
       000000000036c205  0000000000000000  AX       0     0     16
  [16] .rela.text        RELA             0000000000000000  029fa7d0
       00000000002a23a0  0000000000000018   I      48    15     8
  [17] .fini             PROGBITS         000000000056e8d8  0056e8d8
       000000000000000d  0000000000000000  AX       0     0     4
  [18] .rodata           PROGBITS         0000000000600000  00600000
       000000000079dac4  0000000000000000   A       0     0     32
  [19] .rela.rodata      RELA             0000000000000000  02c9cb70
       000000000004d310  0000000000000018   I      48    18     8
  [20] .eh_frame_hdr     PROGBITS         0000000000d9dac4  00d9dac4
       0000000000013a4c  0000000000000000   A       0     0     4
  [21] .eh_frame         PROGBITS         0000000000db1510  00db1510
       0000000000067e00  0000000000000000   A       0     0     8
  [22] .rela.eh_frame    RELA             0000000000000000  02ce9e80
       000000000003aea8  0000000000000018   I      48    21     8
  [23] .init_array       INIT_ARRAY       0000000001166288  00f66288
       0000000000000010  0000000000000008  WA       0     0     8
  [24] .rela.init_array  RELA             0000000000000000  02d24d28
       0000000000000030  0000000000000018   I      48    23     8
  [25] .fini_array       FINI_ARRAY       0000000001166298  00f66298
       0000000000000008  0000000000000008  WA       0     0     8
  [26] .rela.fini_array  RELA             0000000000000000  02d24d58
       0000000000000018  0000000000000018   I      48    25     8
  [27] .data.rel.ro      PROGBITS         00000000011662a0  00f662a0
       00000000000985c0  0000000000000000  WA       0     0     32
  [28] .rela.data.rel.ro RELA             0000000000000000  02d24d70
       00000000000c3c60  0000000000000018   I      48    27     8
  [29] .dynamic          DYNAMIC          00000000011fe860  00ffe860
       0000000000000260  0000000000000010  WA       6     0     8
  [30] .got              PROGBITS         00000000011feac0  00ffeac0
       0000000000001538  0000000000000008  WA       0     0     8
  [31] .data             PROGBITS         0000000001200000  01000000
       0000000000003858  0000000000000000  WA       0     0     32
  [32] .rela.data        RELA             0000000000000000  02de89d0
       00000000000050b8  0000000000000018   I      48    31     8
  [33] .tm_clone_table   PROGBITS         0000000001203858  01003858
       0000000000000000  0000000000000000  WA       0     0     8
  [34] .bss              NOBITS           0000000001203860  01003858
       00000000000206c8  0000000000000000  WA       0     0     32
  [35] .comment          PROGBITS         0000000000000000  01003858
       0000000000000023  0000000000000001  MS       0     0     1
  [36] .debug_aranges    PROGBITS         0000000000000000  0100387b
       0000000000008f10  0000000000000000           0     0     1
  [37] .rela.debug_arang RELA             0000000000000000  02deda88
       000000000000b388  0000000000000018   I      48    36     8
  [38] .debug_info       PROGBITS         0000000000000000  0100c78b
       0000000000c4293a  0000000000000000           0     0     1
  [39] .rela.debug_info  RELA             0000000000000000  02df8e10
       000000000131c700  0000000000000018   I      48    38     8
  [40] .debug_abbrev     PROGBITS         0000000000000000  01c4f0c5
       0000000000075be6  0000000000000000           0     0     1
  [41] .debug_line       PROGBITS         0000000000000000  01cc4cab
       0000000000181cc8  0000000000000000           0     0     1
  [42] .rela.debug_line  RELA             0000000000000000  04115510
       0000000000003c60  0000000000000018   I      48    41     8
  [43] .debug_str        PROGBITS         0000000000000000  01e46973
       0000000000097f7b  0000000000000001  MS       0     0     1
  [44] .debug_loc        PROGBITS         0000000000000000  01ede8ee
       00000000008c8fb1  0000000000000000           0     0     1
  [45] .rela.debug_loc   RELA             0000000000000000  04119170
       000000000108acb0  0000000000000018   I      48    44     8
  [46] .debug_ranges     PROGBITS         0000000000000000  027a789f
       000000000015f580  0000000000000000           0     0     1
  [47] .rela.debug_range RELA             0000000000000000  051a3e20
       0000000000306ea0  0000000000000018   I      48    46     8
  [48] .symtab           SYMTAB           0000000000000000  02906e20
       0000000000092010  0000000000000018          49   22090     8
  [49] .strtab           STRTAB           0000000000000000  02998e30
       0000000000061986  0000000000000000           0     0     1
  [50] .shstrtab         STRTAB           0000000000000000  054aacc0
       00000000000001b5  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

The following is the version info of perf2bolt

pkb@53cb3a5fdd24:/opt/pkb/git/hhvm-perf$ perf2bolt --version
LLVM (http://llvm.org/):
  LLVM version 14.0.1
  Optimized build with assertions.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: icelake-server

BOLT revision c62053979489ccb002efe411c3af059addcb5d7d

maksfb commented 2 years ago

Thanks for the report. Please use the -strict=0 option as a workaround.
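
For reference, a minimal sketch of the reporter's invocation with the suggested flag appended. The paths are copied from the report above; the variable names and the RUN switch are illustrative assumptions, not part of perf2bolt.

```shell
# Paths copied from the report; adjust them for your setup.
PERF_DATA=perf.data
OUT=/workspace/perf.fdata
BINARY=/opt/pkb/git/hhvm-perf/php-fpm

# Original invocation plus the suggested workaround flag.
CMD="perf2bolt -p $PERF_DATA -o $OUT $BINARY -strict=0"

# Print the command; run it only when RUN=1 is set and perf2bolt is installed.
echo "$CMD"
if [ "${RUN:-0}" = 1 ] && command -v perf2bolt >/dev/null 2>&1; then
    $CMD
fi
```

Running the script as-is only prints the command line, so it can be checked before anything is executed.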

aaupov commented 2 years ago

Another potential cause is the presence of split functions:

BOLT-WARNING: split function detected on input : OnUpdate_date_timezone.cold.13

BOLT is currently incompatible with the -freorder-blocks-and-partition compiler option, which produces split functions. Since GCC8 enables this option by default, you have to explicitly disable it by adding the -fno-reorder-blocks-and-partition flag if you are compiling with GCC8 or above.
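
A minimal sketch for checking a saved log for this warning, assuming the message format shown earlier in this thread. The function name list_split_functions is made up for illustration.

```shell
# Print the names of split functions that BOLT reported in a log read from stdin.
# Assumes lines of the form:
#   BOLT-WARNING: split function detected on input : <name>. The support is ...
list_split_functions() {
    grep '^BOLT-WARNING: split function detected on input : ' |
        sed -e 's/^.* on input : //' -e 's/ .*$//' -e 's/\.$//'
}

# Example usage (capturing both output streams of the tool):
#   perf2bolt -p perf.data -o perf.fdata ./php-fpm 2>&1 | list_split_functions
```

If the function prints any names, the binary contains compiler-generated cold fragments and is likely affected by the limitation described above.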

LeiW000 commented 2 years ago

@maksfb, thanks. I need some time to try that option. I will update you later on whether it works.

@aaupov, thanks for the quick response. I used GCC7 to build php-fpm. My understanding is that I don't have to add that flag. Am I right?

aaupov commented 2 years ago

Check your compiler flags, and if they include -freorder-blocks-and-partition, just remove it. GCC7 has this optimization but doesn't enable it by default.
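
A minimal sketch of that check, assuming the flags are available as one space-separated string without embedded quoting. The function name disable_partitioning is hypothetical. It removes -freorder-blocks-and-partition and appends the explicit -fno- form, which also covers GCC8 and above.

```shell
# Drop -freorder-blocks-and-partition from a flag string and append the
# negated form so the optimization stays off regardless of compiler defaults.
disable_partitioning() {
    result=""
    for flag in $1; do
        case "$flag" in
            # Skip both spellings; the negated form is re-added once below.
            -freorder-blocks-and-partition|-fno-reorder-blocks-and-partition) ;;
            *) result="$result $flag" ;;
        esac
    done
    result="$result -fno-reorder-blocks-and-partition"
    echo "${result# }"
}

# Example usage:
#   CFLAGS=$(disable_partitioning "$CFLAGS")
```

The binary then needs to be rebuilt and re-profiled, since the existing perf.data corresponds to the old build.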

LeiW000 commented 2 years ago

@maksfb, it looks like the option works as a workaround. May I know what "-strict=0" means? I can't find anything about it in the official documentation.