Closed it-klinger closed 2 years ago
The example script from the documentation behaves differently on ARM: the syscalls are named "sys.*" in lowercase, not "SyS.*". When I change the probe to lowercase, it still does not work:
$ cat cnt
kprobe:sys_*
{
@syscalls[caller] = count();
}
$ ply cnt
ERR:-22
During debugging I saw that the symbols are taken from /proc/kallsyms, but that file contains symbols which are not accepted by /sys/kernel/debug/tracing/kprobe_events, e.g.
sys_call_table
Maybe it would be a possible solution to take the symbols from /sys/kernel/debug/tracing/available_filter_functions but I'm not sure if this is a sustainable solution.
Thanks for submitting. Please try to keep each issue to a single topic so tracking is easier.
Regarding debugfs: I recently implemented a self-test mode in ply (ply -T) which uses some heuristics to verify a user's setup. It also checks that debugfs is mounted. While I agree that most users will want to mount debugfs at that point, I do not think it is the job of ply to mount filesystems. NOTE: This is still not pushed, I will try to do that tonight.
The syscall tracing examples are what is causing the most issues to be opened on ply by a mile. There are dragons everywhere, and yet it is the first thing (understandably) that everyone tries. Long-term, I want to implement a proper syscall: provider so that you do not have to keep track of arch-specific stuff to do this.
Regarding wildcards: I was not aware of available_filter_functions, thank you! That does indeed look like the way forward. kallsyms is still needed to get the address information to implement offsets properly, but wildcard matches should be filtered through this list.
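A minimal sketch of that filtering idea, written in Python rather than ply's C for brevity. It is not ply's implementation: the function names and the sample symbols in the comments are hypothetical. It assumes the usual text layouts, "address type name" per line in kallsyms and one function name per line (optionally followed by a "[module]" tag) in available_filter_functions.

```python
from fnmatch import fnmatchcase


def parse_kallsyms(text):
    """Map symbol name -> address from /proc/kallsyms-style text."""
    syms = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            addr, _type, name = parts[:3]
            # Keep the first address seen for a duplicated name.
            syms.setdefault(name, int(addr, 16))
    return syms


def parse_filter_functions(text):
    """Collect names from available_filter_functions-style text."""
    names = set()
    for line in text.splitlines():
        parts = line.split()
        if parts:
            names.add(parts[0])  # ignore a trailing "[module]" tag
    return names


def expand_wildcard(pattern, kallsyms_text, filter_text):
    """Return {name: address} for symbols that match the pattern
    AND are listed as traceable functions."""
    syms = parse_kallsyms(kallsyms_text)
    traceable = parse_filter_functions(filter_text)
    return {
        name: addr
        for name, addr in sorted(syms.items())
        if name in traceable and fnmatchcase(name, pattern)
    }
```

With this approach a data symbol such as sys_call_table still appears in the kallsyms map, but it is dropped from the wildcard expansion because it is not listed among the traceable functions.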
The self-test is now on master: https://github.com/wkz/ply/commit/e25c9134b856cc7ffe9f562ff95caf9487d16b59
It's not working for me. See output below.
The wrong Linux version is compiled into ply. How can I specify the Linux source tree?
# ply -T
Verifying kernel config (/proc/config.gz)... OK
Ensuring that debugfs is mounted... OK
Verifying kprobe... OK
Verifying tracepoint... /bin/sh: line 79: 324 Aborted $PLYBIN 'tracepoint:sched/sched_switch { exit(0); }' 2> /dev/null
ERROR
# ply 'tracepoint:sched/sched_switch { exit(0); }'
ply: provider/tracepoint.c:197: tracepoint_parse: Assertion `offs == type_offsetof(t, t->sou.fields[n - 1].name)' failed.
Aborted
# ply -v
ply 2.1.1-14-ge25c913 (linux-version:267168~4.19.160)
# uname -a
Linux bw 5.10.1-rt20-wega-bw #2 PREEMPT_RT Sun Jan 3 19:39:05 CET 2021 armv7l GNU/Linux
Well, the kernel is less picky about the versions matching these days, so that should not be a problem. That said, you should be able to set CPPFLAGS in the normal way when running configure if you want.
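For reference, the number before the "~" in the ply -v output appears to be the kernel's LINUX_VERSION_CODE, which packs the version as (major << 16) + (minor << 8) + patch. That reading is an assumption based on the two values in this thread, which decode consistently; the helper name below is hypothetical.

```python
def decode_linux_version_code(code):
    """Split a LINUX_VERSION_CODE-style integer into (major, minor, patch)."""
    return (code >> 16) & 0xFFFF, (code >> 8) & 0xFF, code & 0xFF


# Values taken from the "ply -v" outputs in this thread.
print(decode_linux_version_code(267168))  # headers ply was first built against
print(decode_linux_version_code(330241))  # headers after the rebuild
```

Comparing the decoded tuple with the release reported by uname -r is a quick way to spot a header mismatch like the one above.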
Not really sure what is happening here. ARM seems to work fine in the CI job: https://github.com/wkz/ply/runs/1661513654
Could you paste the contents of /sys/kernel/debug/tracing/events/sched/sched_switch/format on your system?
# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 233
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned char common_migrate_disable; offset:8; size:1; signed:0;
field:unsigned char common_preempt_lazy_count; offset:9; size:1; signed:0;
field:char prev_comm[16]; offset:12; size:16; signed:0;
field:pid_t prev_pid; offset:28; size:4; signed:1;
field:int prev_prio; offset:32; size:4; signed:1;
field:long prev_state; offset:36; size:4; signed:1;
field:char next_comm[16]; offset:40; size:16; signed:0;
field:pid_t next_pid; offset:56; size:4; signed:1;
field:int next_prio; offset:60; size:4; signed:1;
print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1), "|", { 0x0001, "S" }, { 0x0002, "D" }, { 0x0004, "T" }, { 0x0008, "t" }, { 0x0010, "X" }, { 0x0020, "Z" }, { 0x0040, "P" }, { 0x0080, "I" }) : "R", REC->prev_state & (((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio
I have now fixed the kernel header version so that it matches the running kernel, but that was not the real problem.
I'm using the PREEMPT_RT patch, and the sched_switch event format differs with and without it. Below is the output when booting Linux without the RT patch, for comparison with the RT output in my previous comment.
So the consequence is that, when kernel patches are in use, ply has to be built against the same patched kernel headers.
# ply -v
ply (linux-version:330241~5.10.1)
# uname -a
Linux bw 5.10.1-wega-bw #1 Fri Jan 8 20:17:29 CET 2021 armv7l GNU/Linux
# ply -T
Verifying kernel config (/proc/config.gz)... OK
Ensuring that debugfs is mounted... OK
Verifying kprobe... OK
Verifying tracepoint... OK
# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 233
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:char prev_comm[16]; offset:8; size:16; signed:0;
field:pid_t prev_pid; offset:24; size:4; signed:1;
field:int prev_prio; offset:28; size:4; signed:1;
field:long prev_state; offset:32; size:4; signed:1;
field:char next_comm[16]; offset:36; size:16; signed:0;
field:pid_t next_pid; offset:52; size:4; signed:1;
field:int next_prio; offset:56; size:4; signed:1;
print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1), "|", { 0x0001, "S" }, { 0x0002, "D" }, { 0x0004, "T" }, { 0x0008, "t" }, { 0x0010, "X" }, { 0x0020, "Z" }, { 0x0040, "P" }, { 0x0080, "I" }) : "R", REC->prev_state & (((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio
Interesting. ply aligns fields as though all members are laid out sequentially. But it seems like the kernel treats the common fields as a separate struct (and therefore aligns prev_comm on a 4 byte boundary).
This shines a light on a major deficiency in ply's type system. Unfortunately this is not a quick fix. Once I get around to that refactor, I will make sure to fix this as well.
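The difference can be reproduced with a small layout calculation. This is only an illustrative sketch in Python, not ply's type code: field sizes come from the PREEMPT_RT format output earlier in the thread, while the alignments are assumed to be the natural ones for 32-bit ARM (2 for short, 4 for int/long/pid_t, 1 for char and char arrays).

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields, start=0):
    """Lay out (name, size, alignment) fields sequentially, C style.
    Returns (offsets by name, end offset, largest alignment seen)."""
    offsets, offset, max_align = {}, start, 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
        max_align = max(max_align, alignment)
    return offsets, offset, max_align


# (name, size, assumed alignment) on the PREEMPT_RT kernel from this thread.
COMMON = [
    ("common_type", 2, 2), ("common_flags", 1, 1),
    ("common_preempt_count", 1, 1), ("common_pid", 4, 4),
    ("common_migrate_disable", 1, 1), ("common_preempt_lazy_count", 1, 1),
]
EVENT = [
    ("prev_comm", 16, 1), ("prev_pid", 4, 4), ("prev_prio", 4, 4),
    ("prev_state", 4, 4), ("next_comm", 16, 1), ("next_pid", 4, 4),
    ("next_prio", 4, 4),
]


def flat_layout():
    """All members in one struct: prev_comm follows the last common byte."""
    return layout(COMMON + EVENT)[0]


def nested_layout():
    """Common fields as their own struct, padded to its alignment first."""
    offsets, end, max_align = layout(COMMON)
    header_size = align_up(end, max_align)  # 10 bytes padded to 12
    offsets.update(layout(EVENT, start=header_size)[0])
    return offsets


print(flat_layout()["prev_comm"], nested_layout()["prev_comm"])
```

Under these assumptions only prev_comm differs between the two layouts, since the following 4-byte-aligned fields land on the same offsets either way; the nested variant is the one that agrees with the offsets in the format file.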
Fixed in 6e25f69
When starting ply without debugfs mounted at /sys/kernel/debug, there is an error:
$ ply <file>
info: creating kallsyms cache
ERR:-2
My proposal is to check for a mounted debugfs and automatically mount it if not present. I can also implement this.
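The check half of this proposal (without the automatic mount) could look roughly like the following. This is a hedged sketch in Python, not what ply -T does: the function name is hypothetical, and it assumes the standard /proc/mounts layout of "device mountpoint fstype options ..." per line.

```python
def debugfs_mounted(mounts_text, mountpoint="/sys/kernel/debug"):
    """Return True if a debugfs filesystem is mounted at mountpoint,
    given the contents of /proc/mounts as text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint and fields[2] == "debugfs":
            return True
    return False


# Typical use on a live system:
#   with open("/proc/mounts") as f:
#       ok = debugfs_mounted(f.read())
```

Taking the file contents as a string keeps the function easy to exercise without root or a particular mount setup.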