llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

How to optimize -pie compiled binary using perf2bolt and llvm-bolt #71899

Open guyin456 opened 10 months ago

guyin456 commented 10 months ago

Dear community, I am running into a problem when using perf2bolt and llvm-bolt to optimize a binary compiled with -fpie. perf2bolt does not seem to disassemble the binary correctly, and the llvm-bolt log reports many invalid profiles.

These are the relevant tool versions: clang 11.1.0, perf2bolt/llvm-bolt 14.0.0.

This is the perf2bolt log:

perf2bolt -p ./perf.data -o perf.fdata  XXX
BOLT-INFO: shared object or position-independent executable detected
PERF2BOLT: Starting data aggregation job for ./perf.data
PERF2BOLT: spawning perf job to read branch events
PERF2BOLT: spawning perf job to read mem events
PERF2BOLT: spawning perf job to read process events
PERF2BOLT: spawning perf job to read task events
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 88c70afe9d388ad430cc150cc158641701397f70
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x10200000, offset 0x10200000
BOLT-INFO: disabling -align-macro-fusion in non-relocation mode
BOLT-INFO: enabling lite mode
BOLT-INFO: pre-processing profile using perf data aggregator
BOLT-WARNING: build-id will not be checked because we could not read one from input binary
PERF2BOLT: waiting for perf mmap events collection to finish...
PERF2BOLT: parsing perf-script mmap events output
PERF2BOLT: waiting for perf task events collection to finish...
PERF2BOLT: parsing perf-script task events output
PERF2BOLT: input binary is associated with 1 PID(s)
PERF2BOLT: waiting for perf events collection to finish...
PERF2BOLT: parse branch events...
PERF2BOLT: read 1188382 samples and 27151892 LBR entries
PERF2BOLT: 12 samples (0.0%) were ignored
PERF2BOLT: traces mismatching disassembled function contents: 3305402 (12.6%)

 !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance.

PERF2BOLT: out of range traces involving unknown regions: 2742313 (10.4%)
BOLT-WARNING: Ignored 0 functions due to cold fragments.
BOLT-INFO: forcing -jump-tables=move as PIC jump table was detected in function _ZN2ks5infra8KsConfigIbE18ResolveConfigStageERKN8kuaishou6config10ConfigDataE
BOLT-WARNING: 6 collisions detected while hashing binary objects. Use -v=1 to see the list.
PERF2BOLT: processing branch events...
PERF2BOLT: wrote 147489 objects and 0 memory objects to perf.fdata

I would be grateful if you could help me solve this.

aaupov commented 9 months ago

I don't see anything particularly wrong with the log. Can you please paste the llvm-bolt log?
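
When comparing perf2bolt runs (for example, before and after rebuilding or re-profiling the binary), it can help to pull the trace statistics out of a saved log instead of reading them by eye. Below is a minimal sketch, not part of BOLT itself. It assumes the log was saved as text and that the summary lines keep the wording shown in the log above, i.e. `PERF2BOLT: <description>: <count> (<percent>%)`. The function name `parse_trace_stats` and the `SAMPLE` string are hypothetical and only for illustration.

```python
import re

# Matches perf2bolt summary lines shaped like
#   "PERF2BOLT: <description>: <count> (<percent>%)"
# The wording is assumed from the log in this thread and may differ
# between BOLT versions.
STAT_RE = re.compile(
    r"^PERF2BOLT: (?P<name>.+?): (?P<count>\d+) \((?P<pct>[\d.]+)%\)$"
)


def parse_trace_stats(log_text):
    """Return {description: (count, percent)} for every matching line."""
    stats = {}
    for line in log_text.splitlines():
        match = STAT_RE.match(line.strip())
        if match:
            stats[match.group("name")] = (
                int(match.group("count")),
                float(match.group("pct")),
            )
    return stats


# Lines copied from the log posted above (hypothetical variable name).
SAMPLE = """\
PERF2BOLT: read 1188382 samples and 27151892 LBR entries
PERF2BOLT: 12 samples (0.0%) were ignored
PERF2BOLT: traces mismatching disassembled function contents: 3305402 (12.6%)
PERF2BOLT: out of range traces involving unknown regions: 2742313 (10.4%)
"""

if __name__ == "__main__":
    for name, (count, pct) in parse_trace_stats(SAMPLE).items():
        print(f"{name}: {count} traces ({pct}%)")
```

Lines that do not follow the `description: count (percent%)` shape, such as the "read ... samples" and "samples ... were ignored" lines, are skipped, so only the mismatch and out-of-range statistics are returned.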