sharklinux / shark

We're building a better performance management system
http://www.sharkly.io
GNU Lesser General Public License v2.1

samples/perf/threads_running.lua gives me a parse events failed #6

Closed. pcn closed this issue 9 years ago

pcn commented 9 years ago

I haven't got much experience in how to describe events. How would I go about troubleshooting this:

$ sudo ./shark samples/perf/threads_running.lua 
parse events [sched:sched_switch] failed!

sharklinux commented 9 years ago

Hi,

Did you mount debugfs?

mount -t debugfs nodev /sys/kernel/debug
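To check this from a script rather than by eye, here is a minimal sketch that parses text in the format printed by `mount` and reports which filesystem type sits at a given mount point. The names `parse_mount_output` and `fstype_at` are made up for this illustration and are not part of shark. The sample lines are an assumed example of a system with both debugfs and tracefs mounted, and mount points containing spaces are not handled.

```python
import re

# One line of `mount` output looks like:
#   <device> on <mountpoint> type <fstype> (<options>)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mount_output(text):
    """Return one dict per recognizable line of `mount` output."""
    mounts = []
    for line in text.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            device, mountpoint, fstype, options = match.groups()
            mounts.append({
                "device": device,
                "mountpoint": mountpoint,
                "fstype": fstype,
                "options": options.split(","),
            })
    return mounts


def fstype_at(mounts, mountpoint):
    """Return the filesystem type mounted at mountpoint, or None if absent."""
    for entry in mounts:
        if entry["mountpoint"] == mountpoint:
            return entry["fstype"]
    return None


# Hypothetical sample input; on a real system this text would come from `mount`.
SAMPLE = """\
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
"""

mounts = parse_mount_output(SAMPLE)
print(fstype_at(mounts, "/sys/kernel/debug"))
print(fstype_at(mounts, "/sys/kernel/debug/tracing"))
```

A result of None for /sys/kernel/debug would mean debugfs is not mounted there and the mount command above is the first thing to try.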

pcn commented 9 years ago

Yes indeed:

$ mount | grep debug
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
$ sudo ./shark samples/perf/threads_running.lua 
parse events [sched:sched_switch] failed!

pcn commented 9 years ago

It looks like using the mainline kernel has the downside of not having the perf tools installed as a package. I'm going to keep this open to see if there's a solution that doesn't involve compiling the kernel myself.

sharklinux commented 9 years ago

Which kernel version are you using?

pcn commented 9 years ago

I'm interested in learning about the eBPF integration you've created, so I'm using a 4.1.2-rc2 Ubuntu mainline build. I'm rebuilding the kernel to get the tools.

sharklinux commented 9 years ago

Can you run the command below on your system?

sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format

pcn commented 9 years ago

root@vm:/sys/kernel/debug/tracing/events/sched/sched_switch# cat format
name: sched_switch
ID: 266
format:
    field:unsigned short common_type;   offset:0;   size:2; signed:0;
    field:unsigned char common_flags;   offset:2;   size:1; signed:0;
    field:unsigned char common_preempt_count;   offset:3;   size:1; signed:0;
    field:int common_pid;   offset:4;   size:4; signed:1;

    field:char prev_comm[16];   offset:8;   size:16;    signed:1;
    field:pid_t prev_pid;   offset:24;  size:4; signed:1;
    field:int prev_prio;    offset:28;  size:4; signed:1;
    field:long prev_state;  offset:32;  size:8; signed:1;
    field:char next_comm[16];   offset:40;  size:16;    signed:1;
    field:pid_t next_pid;   offset:56;  size:4; signed:1;
    field:int next_prio;    offset:60;  size:4; signed:1;

print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, REC->prev_state & (1024-1) ? __print_flags(REC->prev_state & (1024-1), "|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" }, { 64, "x" }, { 128, "K" }, { 256, "W" }, { 512, "P" }) : "R", REC->prev_state & 1024 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio
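The field lines in this output follow a regular shape (`field:<declaration>; offset:N; size:N; signed:N;`), so the record layout can be pulled out mechanically. The following is a rough sketch of such a parser, not the one shark or perf uses; `parse_format_fields` is a hypothetical name, and the sample is a shortened copy of the output above.

```python
import re

# Each field line has the shape:
#   field:<C declaration>; offset:<n>; size:<n>; signed:<0|1>;
FIELD_LINE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);"
    r"\s*size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)


def parse_format_fields(text):
    """Return the fields described in a tracepoint format file, in order."""
    fields = []
    for line in text.splitlines():
        match = FIELD_LINE.search(line)
        if not match:
            continue  # name:, ID:, format:, blank lines, and print fmt:
        decl = match.group("decl").strip()
        # The field name is the last token, minus any array suffix like [16].
        name = decl.split()[-1].split("[")[0]
        fields.append({
            "name": name,
            "decl": decl,
            "offset": int(match.group("offset")),
            "size": int(match.group("size")),
            "signed": match.group("signed") == "1",
        })
    return fields


# Shortened sample taken from the sched_switch output in this thread.
SAMPLE = """\
name: sched_switch
ID: 266
format:
    field:unsigned short common_type;   offset:0;   size:2; signed:0;
    field:int common_pid;   offset:4;   size:4; signed:1;

    field:char prev_comm[16];   offset:8;   size:16;    signed:1;
    field:pid_t next_pid;   offset:56;  size:4; signed:1;
"""

fields = parse_format_fields(SAMPLE)
for field in fields:
    print(field["name"], field["offset"], field["size"])
```

If this file can be read and parsed, the tracepoint itself is present in the kernel, which points the search for the failure at the tool rather than at the event definition.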
sharklinux commented 9 years ago

OK, I will try running a mainline kernel on my local box to test it.

sharklinux commented 9 years ago

I reproduced the issue, the same as you found. I'm debugging it now and will let you know the result soon.

sharklinux commented 9 years ago

Fixed now, please pull again.

The root cause is that DEBUGFS_MAGIC changed recently, which made the debugfs check in libapikfs.a fail.

The solution is to remove that debugfs magic number check; it doesn't have much value anyway.

pcn commented 9 years ago

Great work, I can confirm that it's working now.
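To illustrate why a magic number check like the one described in the fix is brittle, the sketch below compares a filesystem type value against a single expected constant. It is not the code from libapikfs.a, and the thread does not say which value the failing check actually saw; both function names are hypothetical. The two constants are the values from the kernel's `linux/magic.h`, and tracefs is included only because the `mount` output earlier in the thread shows it mounted at /sys/kernel/debug/tracing. The type value is passed in as a plain integer so the sketch runs without touching a real mount.

```python
# Filesystem magic numbers as defined in the kernel's linux/magic.h.
DEBUGFS_MAGIC = 0x64626720  # ASCII "dbg "
TRACEFS_MAGIC = 0x74726163  # ASCII "trac"


def strict_debugfs_check(f_type):
    """Accept a mount point only if its type is exactly DEBUGFS_MAGIC."""
    return f_type == DEBUGFS_MAGIC


def relaxed_check(f_type):
    """Hypothetical replacement: trust the mount point, skip the magic test."""
    return True


# Any type value other than the one expected constant is rejected by the
# strict check, even when the tracing files underneath are perfectly usable.
for label, f_type in [("debugfs", DEBUGFS_MAGIC), ("tracefs", TRACEFS_MAGIC)]:
    print(label, strict_debugfs_check(f_type), relaxed_check(f_type))
```

Dropping the comparison trades a small amount of validation for robustness: if the path is wrong, opening the event files fails anyway and reports the problem at that point.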