Open yonghong-song opened 2 months ago
cc @4ast @anakryiko @eddyz87 @jemarch
cc @nikic @david-xl Maybe you can give some suggestions?
If I'm understanding correctly, these are functions that get internalized during LTO? In principle, the correct fix would be to prevent internalization of these symbols.
Is it possible to trace all functions in this way, or are they some subset marked in some way?
I agree that traced symbols should probably not be optimized with signature change. If signature change happens, it is better to change the function name to avoid confusion. Optimization remarks should also report the changes (if not done).
@nikic, those functions are actually optimized by regular optimizations and some of them are optimized during LTO but by regular optimization passes. The symbol promotion (from static to global) during LTO won't change function signature so it is okay.
@david-xl Right. This is what I suggested. We at least need to change function name (e.g., with a suffix like gcc) to signal that function signature may have changed. Otherwise, users may get incorrect result and is very confused about that.
It would be even better to add such func signature change information to dwarf since dwarf is in kernel binary. Currently kernel llvm build does not have remarks. If dwarf is not possible, let us ensure remarks do have such signature change information.
@nikic, those functions are actually optimized by regular optimizations and some of them are optimized during LTO but by regular optimization passes. The symbol promotion (from static to global) during LTO won't change function signature so it is okay.
The optimization happens in "regular optimizations", but it is only possible because the function is internal
, which is what allows LLVM to change the signature. This would not happen if the function were exported. The internalization happens as part of LTO, which is why I'm saying that would be the proper place to fix this.
@nikic, yes, all those functions (changing signatures) are static (internal) functions. Based on above discussion with you and @david-xl, looks like we could modify function name (e.g. with a suffix) to differentiate from original functions. I will look into this.
@david-xl I checked one of kernel file (kernel/bpf/syscall.c), I got the following output with -mllvm -debug-only=deadargelim
DeadArgumentEliminationPass - Removing argument 1 (uattr.coerce0) from map_update_elem
DeadArgumentEliminationPass - Removing argument 1 (uattr.coerce0) from map_delete_elem
DeadArgumentEliminationPass - Removing argument 3 (count) from strncpy_from_bpfptr
I added optiont -foptimization-record-file=t
to save remarks. The following is what I got
$ cat t | grep Name | sort | uniq
Name: AlwaysInline
Name: DontUnroll
Name: FullyUnrolled
Name: Hoisted
Name: Inlined
Name: InstructionCount
Name: InstructionMix
Name: LoadClobbered
Name: LoadElim
Name: LoadWithLoopInvariantAddressCondExecuted
Name: LoadWithLoopInvariantAddressInvalidated
Name: LoopSpillReloadCopies
Name: NeverInline
Name: NoDefinition
Name: PartialUnrolled
Name: PromoteLoopAccessesToScalar
Name: SpillReloadCopies
Name: StackSize
Name: TooCostly
and pass
$ cat t | grep Pass | sort | uniq
Pass: asm-printer
--- !Passed
Pass: gvn
Pass: inline
Pass: licm
Pass: loop-unroll
Pass: prologepilog
Pass: regalloc
Pass: TTI
I specifically checked remark file for map_update_elem, map_delete_elem and strncpy_from_bpfptr and there is no deadargelim optimization for these three functions.
Therefore, I think there are no optimization remarks for deadargelim pass.
An optimization remark can be added here https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp#L768
BPF is a linux kernel technology which allows to run some custom codes in kernel. Specially, BPF allows to tracing kernel functions given a function signature. For example in [1], in include/linux/hrtimer.h, we have
The following bpf program can be written to trace kernel function hrtimer_start_range_ns():
The function 'hrtimer_start_range_ns' needs to be in kernel /proc/kallsyms so proper kernel address can be found for the to-be-traced function.
But in clang, it seems possible that function signature might be changed during optimization without changing function names. I tried kernel/bpf/syscall.c with additional option '-mllvm -debug-only=deadargelim' which intends to print out some information for DeadArgumentElimination pass. Eventually, I found three function signature changes:
The following are IR before and after DeadArgumentEliminationPass:
Before:
After:
You can see the above function name remains the same in symbol table but actually the number of arguments have changed.
ArgumentPromotion could also change function signature without change function names. This pass is enabled at -O3 or with LTO. Let us say thinLTO is enabled to build the kernel and "-mllvm -debug-only=argpromotion" is added to dump certain transformation info. I hit a case with file tools/lib/subcmd/exec-cmd.c. Before:
After:
gcc has various suffix like .constprop., .part., .isra. to indicate
func signature having changed. For clang, looks like PartialInlining is not enabled by default.
The above ArgumentPromotion is quite similar to .constprop.. I am not sure whether clang
does .isra.-like transformation or not.
If clang optimization changes a function signature, ideally case we could like to have changed signature somewhere (e.g. dwarf) for users easily to do bpf func tracing. But at least we could have a way to encode such signature change information somewhere (dwarf, adding suffix, etc.)?
[1] https://github.com/torvalds/linux [2] https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_vmlinux.c#L81-L88