Closed LHolten closed 1 year ago
I have confirmed that marking a function as #[inline(never)]
makes hotspot able to find the correct source file
Have you tried setting debug = 2
? I think that helps with disentangling inlining effects.
Edit: Nevermind, I checked and on current rustc debug=1 is sufficient to get proper profiling info
Hi, that will sadly be a nofix.
The disassembler runs objdump -l -d -C target/release/table_base --start-address 0xbe40 --stop-address 0xcba9
(start and stop address vary per symbol) and uses the output to try to reconstruct the source code.
The output will normally look like this:
0000000000001040 <main>:
main():
/tmp/test.c:4
1040: 55 push %rbp
/tmp/test.c:10
1041: 48 8d 2d bc 0f 00 00 lea 0xfbc(%rip),%rbp # 2004 <_IO_stdin_used+0x4>
1048: 31 f6 xor %esi,%esi
104a: 31 c0 xor %eax,%eax
We then use the first reference and say that this is the start of the function. In your case the compiler inlines a lot of functions, which will result in something like this:
onitama_solver::onitama_simd::update::<impl onitama_solver::onitama_simd::Update>::get_unresolved:
c0a9: 44 89 f8 mov %r15d,%eax
c0ac: 48 03 44 24 28 add 0x28(%rsp),%rax
c0b1: 48 8b 4c 24 38 mov 0x38(%rsp),%rcx
onitama_solver::onitama_simd::accum_spread::<impl onitama_solver::onitama_simd::Accum>::accumulate:
c0b6: 0f bb c1 btc %eax,%ecx
c0b9: 44 0f bb f9 btc %r15d,%ecx
c0bd: 48 8b 54 24 40 mov 0x40(%rsp),%rdx
c0c2: 44 0f b3 fa btr %r15d,%edx
<bit_iter::BitIter<u32> as core::iter::traits::iterator::Iterator>::fold:
/home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bit-iter-1.1.1/src/lib.rs:144
c0c6: 41 89 cc mov %ecx,%r12d
c0c9: 45 31 f6 xor %r14d,%r14d
c0cc: 41 09 d4 or %edx,%r12d
c0cf: 74 50 je c121 <onitama_solver::onitama_simd::update::<impl onitama_solver::onitama_simd::Update>::update_layout+0x2e1>
c0d1: 31 f6 xor %esi,%esi
core::num::<impl u32>::trailing_zeros:
c0d3: 66 66 66 66 2e 0f 1f data16 data16 data16 cs nopw 0x0(%rax,%rax,1)
c0da: 84 00 00 00 00 00
/rustc/0ab38e95bb1cbf0bd038d359bdecbfa501f003a7/library/core/src/num/mod.rs:1143
c0e0: f3 45 0f bc d4 tzcnt %r12d,%r10d
As you can see onitama_solver::onitama_simd::accum_spread::<impl onitama_solver::onitama_simd::Accum>::accumulate
does not have a source code reference, which is a result of inlining.
Since the disassembler uses the first reference it finds as the source file, it may choose the wrong file.
agreed, there's nothing we can do about this I'm afraid. if anyone has a suggestion that doesn't require us to disassemble all build artifacts eagerly then I'm all ear. please reopen then. for now, I'm closing this
Btw, I do not know how it works, but samply (another profiler) is able to show the source code with line based sample attribution for these inlined functions.
ok, I think I found a solution. If I comple test.c
#include <stdio.h>
inline int square(int i) {
return i * i;
}
int main() {
for (int i = 0; i < 10000; i++) {
int sum = 0;
for (int j = 0; j < i; j++) {
sum += square(j);
}
printf("Sum: %d\n", sum);
}
}
with -O3 -g
I end up with the following in objdump:
0000000000001040 <main>:
main():
/tmp/test.c:7
1040: 55 push %rbp
/tmp/test.c:13
1041: 48 8d 2d bc 0f 00 00 lea 0xfbc(%rip),%rbp # 2004 <_IO_stdin_used+0x4>
....
square():
/tmp/test.c:4
109b: 66 0f 6f d0 movdqa %xmm0,%xmm2
109f: 66 0f f4 d0 pmuludq %xmm0,%xmm2
10a3: 66 0f 73 d0 20 psrlq $0x20,%xmm0
10a8: 66 0f f4 c0 pmuludq %xmm0,%xmm0
10ac: 66 0f 70 d2 08 pshufd $0x8,%xmm2,%xmm2
10b1: 66 0f 70 c0 08 pshufd $0x8,%xmm0,%xmm0
10b6: 66 0f 62 d0 punpckldq %xmm0,%xmm2
main():
/tmp/test.c:11 (discriminator 1)
10ba: 66 0f fe ca paddd %xmm2,%xmm1
...
dwarfdump gives me:
0x000000f6: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x00000166 "square")
DW_AT_entry_pc (0x000000000000109b)
DW_AT_GNU_entry_view (0x01)
DW_AT_ranges (0x0000006f
[0x000000000000109b, 0x00000000000010ba)
[0x00000000000010f0, 0x00000000000010f5))
DW_AT_call_file ("/tmp/test.c")
DW_AT_call_line (11)
DW_AT_call_column (0x20)
0x00000166: DW_TAG_subprogram
DW_AT_external (true)
DW_AT_name ("square")
DW_AT_decl_file ("/tmp/test.c")
DW_AT_decl_line (3)
DW_AT_decl_column (0x0c)
DW_AT_prototyped (true)
DW_AT_type (0x00000054 "int")
DW_AT_inline (DW_INL_declared_inlined)
so it may be worth investing some time into proper dwarf parsing.
Describe the bug The disassembler only shows the entrypoint source file. This could be due to every function getting inlined.
To Reproduce
cargo build --release
/target/release/table_base
with argument4
Expected behavior It should show the selected function source code, instead we always get the same file
/src/bin/table_base.rs
Version Info (please complete the following information):
Additional context Console output:
If necessary I can try to make a smaller reproducer. Thank you for making this software btw!