llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[BOLT] perf2bolt, kernel: out of range traces ... 100% #99373

Open romanovj opened 1 month ago

romanovj commented 1 month ago

Kernel: linux-cachyos-bore, built with GCC-14 or Clang-17, without LTO.

Profile collected with: perf record -a -e cycles -j any,k -F 1000 -- sleep 60

perf2bolt -p perf.data -o perf.fdata vmlinux
PERF2BOLT: Starting data aggregation job for perf.data
PERF2BOLT: spawning perf job to read branch events
PERF2BOLT: spawning perf job to read mem events
PERF2BOLT: spawning perf job to read process events
PERF2BOLT: spawning perf job to read task events
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 48f55ba9d8e51cf976a8789521dd0763dea1e2d1
BOLT-INFO: Linux kernel binary detected
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: static input executable detected
BOLT-INFO: enabling lite mode
BOLT-WARNING: split function detected on input : do_one_initcall.cold
BOLT-ERROR: symbol seen in the middle of the function srso_untrain_ret/1(*2). Skipping.
BOLT-ERROR: symbol seen in the middle of the function retbleed_untrain_ret/1(*2). Skipping.
BOLT-INFO: pre-processing profile using perf data aggregator
BOLT-INFO: binary build-id is:     e17bb67894db56faa09f73703cc90ef96aecbad5
PERF2BOLT: spawning perf job to read buildid list
PERF2BOLT-WARNING: build-id matched a different file name
PERF2BOLT: waiting for perf task events collection to finish...
PERF2BOLT: parsing perf-script task events output
PERF2BOLT: input binary is associated with 0 PID(s)
PERF2BOLT: waiting for perf events collection to finish...
PERF2BOLT: parse branch events...
PERF2BOLT: read 39377 samples and 629388 LBR entries
PERF2BOLT: 0 samples (0.0%) were ignored
PERF2BOLT: traces mismatching disassembled function contents: 0 (0.0%)
PERF2BOLT: out of range traces involving unknown regions: 590011 (100.0%)
PERF2BOLT: waiting for perf mem events collection to finish...
BOLT-INFO: parsed 12288 SMP lock entries
BOLT-INFO: parsed 0 static call entries
BOLT-INFO: parsed 697 exception table entries
BOLT-INFO: parsed 12430 bug table entries
BOLT-INFO: setting --alt-inst-has-padlen=0
BOLT-INFO: setting --alt-inst-feature-size=4
BOLT-INFO: parsed 10660 alternative instruction entries
BOLT-INFO: parsed 480936 ORC entries
BOLT-INFO: parsed 953 PCI fixup entries
BOLT-INFO: parsed 8364 static keys jump entries
BOLT-WARNING: Running parallel work of 0 estimated cost, will switch to  trivial scheduling.
PERF2BOLT: processing branch events...
PERF2BOLT: wrote 0 objects and 0 memory objects to perf.fdata
BOLT-INFO: 0 out of 113035 functions in the binary (0.0%) have non-empty execution profile

100MB vmlinux + perf.data https://disk.yandex.com/d/uTngs08s5dt-lw

hun-zi-shang-fen commented 1 month ago

We encountered the same issue and eventually found that it was caused by KASLR. Disabling KASLR resolved it. The problem is discussed in https://github.com/llvm/llvm-project/pull/98153.
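
To illustrate why KASLR can make every trace land outside the known regions: with KASLR enabled the kernel text is loaded at a randomized offset, so the addresses perf samples at runtime no longer match the static addresses in vmlinux. Below is a minimal Python sketch for estimating that offset. It assumes the runtime address of `_text` can be read from /proc/kallsyms (this needs root or a relaxed kptr_restrict, otherwise the addresses show up as zeros) and that the static address comes from `nm vmlinux` output. The function names are hypothetical and not part of BOLT or perf.

```python
def parse_symbol_address(lines, symbol="_text"):
    """Return the address of `symbol` from `nm`-style or /proc/kallsyms-style lines.

    Expected line layout: <hex address> <type> <name> [module]
    Returns None if the symbol is not found.
    """
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[2] == symbol:
            return int(parts[0], 16)
    return None


def kaslr_offset(runtime_lines, static_lines, symbol="_text"):
    """Difference between the runtime and the link-time address of `symbol`.

    A non-zero result means the running kernel was relocated, so raw sample
    addresses will not line up with the addresses stored in vmlinux.
    """
    runtime = parse_symbol_address(runtime_lines, symbol)
    static = parse_symbol_address(static_lines, symbol)
    if runtime is None or static is None:
        raise ValueError(f"symbol {symbol!r} not found in one of the inputs")
    if runtime == 0:
        # /proc/kallsyms hides addresses from unprivileged readers.
        raise ValueError("runtime address is zero; read /proc/kallsyms as root")
    return runtime - static
```

For example, passing the lines of /proc/kallsyms as `runtime_lines` and the output of `nm vmlinux` as `static_lines` gives 0 when the kernel was booted without randomization and a non-zero offset otherwise.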

ptr1337 commented 1 month ago

@romanovj Please be aware that the patches in the kernel-patches dir are still pretty new and not well tested. Also, to BOLT the kernel you need an unstripped vmlinux. As @hun-zi-shang-fen mentioned, adding "nokaslr" to the kernel command line fixes the profile conversion.
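
Before collecting a new profile, it may help to confirm that the running kernel was actually booted with "nokaslr". A minimal Python sketch, assuming a Linux system where the boot parameters are exposed in /proc/cmdline; the helper names are hypothetical. It only checks for the parameter and says nothing about how the kernel was configured.

```python
from pathlib import Path


def kaslr_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains the `nokaslr` parameter."""
    # Parameters are whitespace-separated; match the whole token only.
    return "nokaslr" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text()


if __name__ == "__main__":
    if kaslr_disabled(read_cmdline()):
        print("nokaslr is set")
    else:
        print("nokaslr is NOT set; perf2bolt may report out of range traces")
```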