jonhoo / inferno

A Rust port of FlameGraph
Other
1.64k stars 117 forks source link

flamegraph does not contain sys_enter_* calls with params #308

Open SergejIsbrecht opened 9 months ago

SergejIsbrecht commented 9 months ago

Version

Inferno 0.11.18

Issue

When using Inferno to process sampled perf script output, some stacktraces a missing in a generated flamegraph.svg

perf record -F499 -e 'syscalls:sys_enter_*' --call-graph dwarf -- command
perf script > out

The generated flamegraph only contains sys calls without any parameters. For example:

File: out

XXXXXX XXXXX [005] 44427.957866: syscalls:sys_enter_sched_yield: 
          xxxxxxxxxx __sched_yield (/gpl/lib/libc)
          ...
          xxxxxxxxxx [unknown] (/gpl/lib/libpthread)
          xxxxxxxxxx [unknown] (/gpl/lib/libc)
XXXX XXXX [005] 44422.848908: syscalls:sys_enter_getuid: 
          XXXXXXXXXX getuid (/gpl/lib/libc)
...
          XXXXXXXXXX __libc_start_main (/gpl/lib/libc)
          XXXXXXXXXX _start (/opt/hmi/native_image/command)

This calls are visible in the flamegraph. On the contrary following entries are missing in the flamegraph.

XXXXXXXXX XXXXX [001] 44422.868561: syscalls:sys_enter_munmap: addr: 0xXXXXX, len: 0xXXXXX
          XXXXXXX __munmap (/gpl/lib/libc)
          ....
          XXXXXXX [unknown] (/gpl/lib/libpthread)
          XXXXXXX [unknown] (/gpl/lib/libc)

Output: inferno-collapse-perf

perf script

XXXXXXXXXXXX XXXXX [003] 44422.850672: syscalls:sys_enter_sched_getaffinity: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
          XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
          XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
          ....
          XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
          XXXXXXXXXX [unknown] (/gpl/lib/libc)

inferno-collapse-perf

[WARN  inferno::collapse::perf::logging] Weird stack line: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
[WARN  inferno::collapse::perf::logging] Weird event line:        XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line:        XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
...
[WARN  inferno::collapse::perf::logging] Weird event line:        XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line:        XXXXXXXXXX [unknown] (/gpl/lib/libc)

Perf

perf version 4.4

FlameGraph

Seems like FlameGraph by Brendan Gregg is able to pars perf script properly

SergejIsbrecht commented 9 months ago

Example

perf 19953 [001] 33076.615333:                   syscalls:sys_enter_ioctl: fd: 0x0000040e, cmd: 0x00002400, arg: 0x00000000
              112d8f [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6)
              30bc56 __evlist__enable+0x96 (inlined)
              26c177 __cmd_record+0x24b7 (inlined)
    ffffffffffffffff [unknown] ([unknown])

pipewire-pulse  2110 [000] 33076.615351:                syscalls:sys_enter_recvfrom: fd: 0x0000001a, ubuf: 0x56449a86f51c, size: 0x00000014, flags: 0x00000040, addr: 0x00000000, addr_len: 0x00000000
              12063c __libc_recv+0x1ec (inlined)
              12063c __libc_recv+0x1ec (inlined)

pipewire-pulse  2110 [000] 33076.615356:              syscalls:sys_enter_epoll_wait: epfd: 0x00000004, events: 0x7ffd9af1c170, maxevents: 0x00000020, timeout: 0xffffffff
              11eb66 epoll_wait+0x106 (inlined)
                8a22 [unknown] (/usr/lib/x86_64-linux-gnu/spa-0.2/support/libspa-support.so)
        56449a7bb8df [unknown] ([heap])
jonhoo commented 9 months ago

Oh interesting! This suggests we have a corner-case to handle in https://github.com/jonhoo/inferno/blob/86131d445bbe338c0b178933e1fa93393644fc05/src/collapse/perf.rs#L377-L421

I suspect it's actually not the arguments that are tripping us up here, but rather the : in the event name (syscalls:sys_enter_ioctl). event ends up getting set to syscalls here https://github.com/jonhoo/inferno/blob/86131d445bbe338c0b178933e1fa93393644fc05/src/collapse/perf.rs#L377-L380

and then the remainder (sys_enter_epoll and the arguments) end up in post_event here https://github.com/jonhoo/inferno/blob/86131d445bbe338c0b178933e1fa93393644fc05/src/collapse/perf.rs#L396-L408

I'm somewhat short on time these days, so waiting for me to take a look at fixing this would probably take some time, but if you want to take a stab at adding a test case, tracing through the code here to figure out where it gets confused, and then submit a PR with a fix, I'd be happy to try and help guide you through!

SergejIsbrecht commented 9 months ago

I will give it a go, but I write Java most of the time. I have already been looking into the code base, but reading Rust is quite hard to me.

Anyhow, I think it should be easy to fix it after one is familiar with the code base.