Closed shoffmeister closed 2 years ago
It seems as if there are two issues at hand: the root cause of my problems seems to be perf doing a SIGSEGV stunt.
The defect in magic-trace is that it (most probably) is not robust against perf
dying, hence the (by far too late) trigger on the monitor waiting for a perf
process which is already dead.
I'll try to get a handle on why perf dies.
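To make the "waiting for a perf process which is already dead" point concrete, here is a minimal Python sketch of the kind of check a monitor could do before signalling its child. This is not how magic-trace is implemented; the helper names `describe_exit` and `stop_child` are hypothetical, and a small Python child that raises SIGSEGV on itself stands in for perf.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative code.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited cleanly"
    return f"exited with status {returncode}"


def stop_child(proc, timeout=5.0):
    """Ask a child to stop, unless it has already died on its own."""
    status = proc.poll()
    if status is not None:
        # Child is already gone: report that instead of signalling it.
        return f"child was already dead: {describe_exit(status)}"
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Child ignored SIGINT: fall back to SIGKILL and reap it.
        proc.kill()
        proc.wait()
    return f"child stopped: {describe_exit(proc.returncode)}"


if __name__ == "__main__":
    # Hypothetical stand-in for perf: a child that crashes with SIGSEGV.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    child = subprocess.Popen([sys.executable, "-c", crash])
    child.wait()
    print(stop_child(child))
```

The idea is only that polling the child first lets the monitor report "perf was killed by SIGSEGV" right away, rather than failing later while waiting on a process that no longer exists.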
Sun 2022-05-22 23:09:00 CEST 21753 0 0 SIGSEGV present /usr/bin/perf 700.3K
Sun 2022-05-22 23:10:49 CEST 21914 0 0 SIGSEGV present /usr/bin/perf 699.6K
Sun 2022-05-22 23:12:06 CEST 21989 0 0 SIGSEGV present /usr/bin/perf 701.7K
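For anyone wanting to tally crashes like these, a small Python sketch follows. It assumes the rows above come from something like `coredumpctl list` with the columns TIME, PID, UID, GID, SIG, COREFILE, EXE, SIZE; that column mapping is my assumption, not something stated in this thread, and `CrashRow` and `parse_row` are made-up names.

```python
from collections import Counter
from typing import NamedTuple


class CrashRow(NamedTuple):
    time: str      # e.g. "Sun 2022-05-22 23:09:00 CEST"
    pid: int
    uid: int
    gid: int
    sig: str       # e.g. "SIGSEGV"
    corefile: str  # e.g. "present"
    exe: str       # e.g. "/usr/bin/perf"
    size: str      # e.g. "700.3K"


def parse_row(line):
    """Parse one row; the timestamp is assumed to span the first four fields."""
    parts = line.split()
    if len(parts) != 11:
        raise ValueError(f"unexpected row: {line!r}")
    return CrashRow(
        " ".join(parts[:4]),
        int(parts[4]), int(parts[5]), int(parts[6]),
        parts[7], parts[8], parts[9], parts[10],
    )


ROWS = """\
Sun 2022-05-22 23:09:00 CEST 21753 0 0 SIGSEGV present /usr/bin/perf 700.3K
Sun 2022-05-22 23:10:49 CEST 21914 0 0 SIGSEGV present /usr/bin/perf 699.6K
Sun 2022-05-22 23:12:06 CEST 21989 0 0 SIGSEGV present /usr/bin/perf 701.7K
"""

rows = [parse_row(line) for line in ROWS.splitlines()]
# Count crashes per (executable, signal) pair.
crashes = Counter((row.exe, row.sig) for row in rows)
for (exe, sig), count in crashes.items():
    print(f"{exe}: {count} x {sig}")
```

With the three rows from this report, every crash lands in the same bucket: /usr/bin/perf dying with SIGSEGV.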
FWIW, the problem in my perf is https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com/
Essentially this is Fedora 36 build strategy and the kernel tools implementation causing an infinite recursion inside perf.
Still, making magic-trace more robust against failure of perf would be nice. :)
And https://github.com/torvalds/linux/commit/0ae065a5d265bc5ada13e350015458e0c5e5c351 seems to fix that just in time for upstream kernel 5.18.
Thanks for this report, it was a fun read. I agree that we should have a better message if perf dies unexpectedly; I'll look into it.
Running ./magic-trace attach -pid 2463, where 2463 is a peculiar process, then pressing Ctrl+C yields a fatal error.
In my case, I am trying to trace pid=2463, which happens to be the top-level (X11) Xorg process of a Linux desktop environment with sddm being the session manager.
perf --version
perf version 5.17.6
FWIW, the ./magic-trace version is straight off GitHub.
cat /proc/cpuinfo
11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
(That's an 8 core mobile CPU, Tiger Lake)
I am really trying to trace that very process in order to discover more about its (IMHO) abnormal (and unwanted) CPU usage. My hope would be that this yields some tell-tale symbols on the stack, to continue my investigation.
Alas, ... dead.