Open aaupov opened 1 year ago
Should be resolved by https://reviews.llvm.org/D144588
UPD: it's not.
I happened to solve this problem by rebuilding the Linux kernel, after which perf2bolt worked well on the Ampere processor with Ubuntu 20.04. I am not sure if this solution applies to everyone, but you can check the dmesg output after the machine reboots. Before rebuilding the kernel, I saw a message like "firmware bug kernel image not aligned on 64k boundary".
Quick update: it turns out that the host has 4k pages, which explains the mmapped address 0xaaaaac8c4000(0x3a64000) @ 0x1e14000 (aligned to 0x1000, which is 4k).
So the proper solution is to align down not by the segment alignment, which must be a multiple of the page size, but by the page size itself. When aligned by 4k, this particular issue is resolved.
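To make the difference concrete, here is a minimal C++ sketch of the two rounding choices. It is not BOLT's getBaseAddressForMapping; the Segment struct, the baseAddress helper, and the segment values (p_vaddr 0x1e24a40, p_offset 0x1e14a40, p_align 0x10000) are hypothetical and chosen only for illustration. The mmap address and file offset are the ones reported above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Round Value down to a multiple of Align (Align must be a power of two).
inline uint64_t alignDown(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return Value & ~(Align - 1);
}

// Hypothetical description of one PT_LOAD segment (not BOLT's actual type).
struct Segment {
  uint64_t VirtAddr;   // p_vaddr
  uint64_t FileOffset; // p_offset
  uint64_t Align;      // p_align
};

// Match the mmap file offset against the segment's file offset rounded down
// to Granule, then derive the load base from the equally rounded address.
// Returns nullopt when the mapping does not correspond to the segment.
inline std::optional<uint64_t> baseAddress(const Segment &Seg,
                                           uint64_t MMapAddr,
                                           uint64_t MMapFileOffset,
                                           uint64_t Granule) {
  if (alignDown(Seg.FileOffset, Granule) != MMapFileOffset)
    return std::nullopt;
  return MMapAddr - alignDown(Seg.VirtAddr, Granule);
}

// Mapping reported by perf: 0xaaaaac8c4000(0x3a64000) @ 0x1e14000.
const uint64_t MMapAddr = 0xaaaaac8c4000ULL;
const uint64_t MMapOffset = 0x1e14000ULL;

// Assumed segment values; the real ones are not shown in this thread.
const Segment Text{0x1e24a40ULL, 0x1e14a40ULL, 0x10000ULL};
```

With these assumed values, baseAddress(Text, MMapAddr, MMapOffset, Text.Align) finds no match, because rounding the file offset down to 64k gives 0x1e10000 rather than 0x1e14000, while passing 0x1000 as the granule matches and yields a base of 0xaaaaaaaa0000.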
The question is how to extract the page size used while perf sampling:
1) Can we assume that perf2bolt runs on the same host that was used to collect the perf file? In that case getpagesize() would work.
2) If not, perf can capture the page size of the executed ip (PERF_SAMPLE_CODE_PAGE_SIZE) with the --code-page-size perf record option:
    --code-page-size Record the sampled code address (ip) page size
which can then be extracted from each sample. Not the most efficient way, and it requires an extra option, but it'll work.
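For option 1, a minimal sketch of querying the host page size could look like the following. It uses sysconf(_SC_PAGESIZE), the POSIX equivalent of getpagesize(); the helper names hostPageSize and alignDownToPage are made up for this example, and the result is only meaningful for the profile if perf2bolt really runs on a host with the same page size as the one that collected it.

```cpp
#include <cassert>
#include <cstdint>
#include <unistd.h>

// Page size of the host running this code, queried through POSIX sysconf.
// This says nothing about the host that recorded the profile.
inline uint64_t hostPageSize() {
  const long Size = sysconf(_SC_PAGESIZE);
  assert(Size > 0 && "sysconf(_SC_PAGESIZE) failed");
  return static_cast<uint64_t>(Size);
}

// Round Value down to a multiple of PageSize (a power of two).
inline uint64_t alignDownToPage(uint64_t Value, uint64_t PageSize) {
  assert(PageSize != 0 && (PageSize & (PageSize - 1)) == 0);
  return Value & ~(PageSize - 1);
}
```

A caller would then use alignDownToPage(Offset, hostPageSize()) wherever the segment alignment was used for rounding before.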
I tried finding the page size used for mapping in PERF_RECORD_MMAP2 but there's none:
    struct {
        struct perf_event_header header;
        u32 pid;
        u32 tid;
        u64 addr;
        u64 len;
        u64 pgoff;
        union {
            struct {
                u32 maj;
                u32 min;
                u64 ino;
                u64 ino_generation;
            };
            struct { /* if PERF_RECORD_MISC_MMAP_BUILD_ID */
                u8 build_id_size;
                u8 __reserved_1;
                u16 __reserved_2;
                u8 build_id[20];
            };
        };
        u32 prot;
        u32 flags;
        char filename[];
        struct sample_id sample_id;
    };
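Although the record carries no page size, addr, len and pgoff still bound it from above: for a regular user-space file mapping all three are multiples of the page size, so the page size cannot exceed the largest power of two dividing them. The sketch below is my own illustration of that observation, not something perf or BOLT provides; MMapInfo and pageSizeUpperBound are hypothetical names, and the assumption that all three fields are page-aligned may not hold for every record (kernel mappings, for instance).

```cpp
#include <cassert>
#include <cstdint>

// The three mapping fields of PERF_RECORD_MMAP2 that matter here.
struct MMapInfo {
  uint64_t Addr;
  uint64_t Len;
  uint64_t PgOff;
};

// Largest power of two dividing addr, len and pgoff. Under the assumption
// that all three are page-aligned, the real page size cannot exceed this
// value, but it may well be smaller.
inline uint64_t pageSizeUpperBound(const MMapInfo &M) {
  const uint64_t Bits = M.Addr | M.Len | M.PgOff;
  assert(Bits != 0 && "all-zero mapping carries no alignment information");
  return Bits & (~Bits + 1); // isolate the lowest set bit
}
```

For the mapping 0xaaaaac8c4000(0x3a64000) @ 0x1e14000 this gives 0x4000, which rules out 64k pages but cannot tell 4k from 16k, so it is a sanity check rather than a replacement for knowing the actual page size.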
Host: Ubuntu 22.04, Ampere Altra A1 on Oracle Cloud.
Sampled a clang-17 bootstrapped binary built from recent trunk using aaupov/llvm-project/nolbr and aaupov/llvm-devmtg-2022/altra.
perf2bolt fails with
The warning hints at the source of the issue: https://github.com/llvm/llvm-project/blob/b884f4ef0a2de3d0f24111411dff663fd68c2eb0/bolt/lib/Profile/DataAggregator.cpp#L2055-L2063
I sprinkled printf debug statements to look at the calculation done in getBaseAddressForMapping: https://github.com/llvm/llvm-project/blob/b884f4ef0a2de3d0f24111411dff663fd68c2eb0/bolt/lib/Core/BinaryContext.cpp#L1871-L1880
Checking ELF segments:
and perf script --show-mmap-events
Clearly, we use an incorrect calculation for finding the base address, since Align doesn't necessarily imply an alignment of the mmapped address. The ELF spec doesn't mandate that: