iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

opensnoop on arm64 fails because no open syscall #3344

Closed martinetd closed 3 years ago

martinetd commented 3 years ago

(that is true for libbpf-tools/opensnoop.bpf.c, tools/opensnoop.py and bpftrace's opensnoop.bt alike)

I'm surprised I can't find any issue about this. For example, the libbpf-tools variant fails at runtime like this:

libbpf: failed to determine tracepoint 'syscalls/sys_enter_open' perf event ID: No such file or directory
libbpf: prog 'tracepoint__syscalls__sys_enter_open': failed to create tracepoint 'syscalls/sys_enter_open' perf event: No such file or directory
libbpf: failed to auto-attach program 'tracepoint__syscalls__sys_enter_open': -2
failed to attach BPF programs

Is there a way to automatically detect if a tracepoint is available? I'm mostly only interested in the libbpf-tools variant at this point. If possible it would be great to catch this kind of problem at compile time. Even if the result might not hold on another kernel, being able to ensure the probes exist through some generated header file at compile time (optionally?) would help catch typos and the like.

anakryiko commented 3 years ago

It's possible to detect by checking special file(s) in debugfs. But once https://lore.kernel.org/bpf/20210322180441.1364511-1-rafaeldtinoco@ubuntu.com/ makes it into libbpf, libbpf will be able to fall back to legacy kprobes automatically. As for tracepoints, there should be some fallback as well, I need to double-check if anything special should happen there.

martinetd commented 3 years ago

Well, rather than manually checking through debugfs, just replacing bpf_object__attach_skeleton() to run through probes and skip -ENOENT errors on syscalls that are known to not always be present would be fine, I guess. It probably even wouldn't be too hard to decorate the bpf code so that bpftool gen skeleton could mark some probes as optional if explicitly requested?

I've just had a look and don't see any fallback code for that in libbpf, but I'm not sure what kind of fallback you are talking about. In this case the syscall just isn't there on this arch, so whether it's a tracepoint, a kprobe or anything else, there won't be anything to attach to.

anakryiko commented 3 years ago

what kernel version are you running? I suspect it just doesn't support perf_event-based kprobe/tracepoint and needs a different way to attach to kprobe/tracepoint. Or your kernel is not compiled with tracepoint support?.. Don't know. Fallback I'm referring to is the legacy kprobe being implemented in that referenced patch.

As for just silently ignoring -ENOENT errors from skeleton, that's too fragile and error-prone.

martinetd commented 3 years ago

what kernel version are you running? I suspect it just doesn't support perf_event-based kprobe/tracepoint and needs a different way to attach to kprobe/tracepoint. Or your kernel is not compiled with tracepoint support?.. Don't know. Fallback I'm referring to is the legacy kprobe being implemented in that referenced patch.

I tried recently on 5.12-rc2 ish so that's just about as bleeding edge as it can get. It could be I'm missing something in .config though, but I'll need help figuring out what's missing as the most obvious ones look good to me (PERF_EVENTS, KPROBES, FTRACE, FUNCTION_TRACER, FTRACE_SYSCALLS, KPROBE_EVENTS... if anything I didn't have DYNAMIC_FTRACE but KPROBES_ON_FTRACE requires HAVE_KPROBES_ON_FTRACE which isn't set for arm64 so that one isn't available)

Is there some way to check if a fallback is used from a successful probe (e.g. openat; verbose mode output?) or test e.g. if perf trace works I'm good?

As for just silently ignoring -ENOENT errors from skeleton, that's too fragile and error-prone.

Well, I think we need a way to flag some syscalls as optional so it's ok if they're missing -- I'm not sure why you're expecting the libbpf program to "know" that there is no open syscall on arm64. For example I'd expect the probe to fail to attach if I make a typo and request to attach on "sys_enter_opneat"; so why would the lack of tracepoint/syscalls/sys_enter_open be OK for arm64?

martinetd commented 3 years ago

fwiw here's logs of the opensnoop libbpf program with -v after removing hooks to open:

libbpf: loading object 'opensnoop_bpf' from buffer
libbpf: elf: section(2) tracepoint/syscalls/sys_enter_openat, size 376, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_enter_openat': found program 'tracepoint__syscalls__sys_enter_openat' at insn offset 0 (0 bytes), code size 47 insns (376 bytes)
libbpf: elf: section(3) tracepoint/syscalls/sys_exit_openat, size 720, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_exit_openat': found program 'tracepoint__syscalls__sys_exit_openat' at insn offset 0 (0 bytes), code size 90 insns (720 bytes)
libbpf: elf: section(4) .rodata, size 21, link 0, flags 2, type=1
libbpf: elf: section(5) .maps, size 56, link 0, flags 3, type=1
libbpf: elf: section(6) license, size 4, link 0, flags 3, type=1
libbpf: license of opensnoop_bpf is GPL
libbpf: elf: section(7) .BTF, size 3040, link 0, flags 0, type=1
libbpf: elf: section(8) .BTF.ext, size 1036, link 0, flags 0, type=1
libbpf: elf: section(9) .symtab, size 480, link 15, flags 0, type=2
libbpf: elf: section(10) .reltracepoint/syscalls/sys_enter_openat, size 64, link 9, flags 0, type=9
libbpf: elf: section(11) .reltracepoint/syscalls/sys_exit_openat, size 64, link 9, flags 0, type=9
libbpf: looking for externs among 20 symbols...
libbpf: collected 0 externs total
libbpf: map 'start': at sec_idx 5, offset 0.
libbpf: map 'start': found type = 1.
libbpf: map 'start': found max_entries = 10240.
libbpf: map 'start': found key [8], sz = 4.
libbpf: map 'start': found value [12], sz = 16.
libbpf: map 'events': at sec_idx 5, offset 32.
libbpf: map 'events': found type = 4.
libbpf: map 'events': found key_size = 4.
libbpf: map 'events': found value_size = 4.
libbpf: map 'opensnoo.rodata' (global data): at sec_idx 4, offset 0, flags 480.
libbpf: map 2 is "opensnoo.rodata"
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': collecting relocation for section(2) 'tracepoint/syscalls/sys_enter_openat'
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #0: insn #3 against 'targ_tgid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 4, off 0) for insn 3
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #1: insn #11 against 'targ_pid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 4, off 0) for insn 11
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #2: insn #19 against 'targ_uid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 4, off 0) for insn 19
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #3: insn #41 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found map 0 (start, sec 5, off 0) for insn #41
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': collecting relocation for section(3) 'tracepoint/syscalls/sys_exit_openat'
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #0: insn #43 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 0 (start, sec 5, off 0) for insn #43
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #1: insn #49 against 'targ_failed'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found data map 2 (opensnoo.rodata, sec 4, off 0) for insn 49
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #2: insn #77 against 'events'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 1 (events, sec 5, off 32) for insn #77
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #3: insn #85 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 0 (start, sec 5, off 0) for insn #85
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
libbpf: map 'start': created successfully, fd=4
libbpf: map 'events': setting size to 4
libbpf: map 'events': created successfully, fd=5
libbpf: map 'opensnoo.rodata': created successfully, fd=6
libbpf: sec 'tracepoint/syscalls/sys_enter_openat': found 2 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: CO-RE relocating [0] struct trace_event_raw_sys_enter: found target candidate [3857] struct trace_event_raw_sys_enter in [vmlinux]
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: patched insn #33 (LDX/ST/STX) off 24 -> 24
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[2] (0:2:2 @ offset 32)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[2] (0:2:2 @ offset 32)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: patched insn #35 (LDX/ST/STX) off 32 -> 32
libbpf: sec 'tracepoint/syscalls/sys_exit_openat': found 1 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: kind <byte_off> (0), spec is [34] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: CO-RE relocating [0] struct trace_event_raw_sys_exit: found target candidate [3858] struct trace_event_raw_sys_exit in [vmlinux]
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: matching candidate #0 [3858] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: patched insn #48 (LDX/ST/STX) off 16 -> 16

It looks like tracepoints work as expected to me, without fallback. (the program does trace opens as expected with just openat)

For comparison, here's the verbose log of the original version, which just fails to find the tracepoint for sys_enter_open because, as expected, it does not exist:

libbpf: loading object 'opensnoop_bpf' from buffer
libbpf: elf: section(2) tracepoint/syscalls/sys_enter_open, size 376, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_enter_open': found program 'tracepoint__syscalls__sys_enter_open' at insn offset 0 (0 bytes), code size 47 insns (376 bytes)
libbpf: elf: section(3) tracepoint/syscalls/sys_enter_openat, size 376, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_enter_openat': found program 'tracepoint__syscalls__sys_enter_openat' at insn offset 0 (0 bytes), code size 47 insns (376 bytes)
libbpf: elf: section(4) tracepoint/syscalls/sys_exit_open, size 720, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_exit_open': found program 'tracepoint__syscalls__sys_exit_open' at insn offset 0 (0 bytes), code size 90 insns (720 bytes)
libbpf: elf: section(5) tracepoint/syscalls/sys_exit_openat, size 720, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_exit_openat': found program 'tracepoint__syscalls__sys_exit_openat' at insn offset 0 (0 bytes), code size 90 insns (720 bytes)
libbpf: elf: section(6) .rodata, size 21, link 0, flags 2, type=1
libbpf: elf: section(7) .maps, size 56, link 0, flags 3, type=1
libbpf: elf: section(8) license, size 4, link 0, flags 3, type=1
libbpf: license of opensnoop_bpf is GPL
libbpf: elf: section(9) .BTF, size 3487, link 0, flags 0, type=1
libbpf: elf: section(10) .BTF.ext, size 2028, link 0, flags 0, type=1
libbpf: elf: section(11) .symtab, size 744, link 19, flags 0, type=2
libbpf: elf: section(12) .reltracepoint/syscalls/sys_enter_open, size 64, link 11, flags 0, type=9
libbpf: elf: section(13) .reltracepoint/syscalls/sys_enter_openat, size 64, link 11, flags 0, type=9
libbpf: elf: section(14) .reltracepoint/syscalls/sys_exit_open, size 64, link 11, flags 0, type=9
libbpf: elf: section(15) .reltracepoint/syscalls/sys_exit_openat, size 64, link 11, flags 0, type=9
libbpf: looking for externs among 31 symbols...
libbpf: collected 0 externs total
libbpf: map 'start': at sec_idx 7, offset 0.
libbpf: map 'start': found type = 1.
libbpf: map 'start': found max_entries = 10240.
libbpf: map 'start': found key [8], sz = 4.
libbpf: map 'start': found value [12], sz = 16.
libbpf: map 'events': at sec_idx 7, offset 32.
libbpf: map 'events': found type = 4.
libbpf: map 'events': found key_size = 4.
libbpf: map 'events': found value_size = 4.
libbpf: map 'opensnoo.rodata' (global data): at sec_idx 6, offset 0, flags 480.
libbpf: map 2 is "opensnoo.rodata"
libbpf: sec '.reltracepoint/syscalls/sys_enter_open': collecting relocation for section(2) 'tracepoint/syscalls/sys_enter_open'
libbpf: sec '.reltracepoint/syscalls/sys_enter_open': relo #0: insn #3 against 'targ_tgid'
libbpf: prog 'tracepoint__syscalls__sys_enter_open': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 3
libbpf: sec '.reltracepoint/syscalls/sys_enter_open': relo #1: insn #11 against 'targ_pid'
libbpf: prog 'tracepoint__syscalls__sys_enter_open': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 11
libbpf: sec '.reltracepoint/syscalls/sys_enter_open': relo #2: insn #19 against 'targ_uid'
libbpf: prog 'tracepoint__syscalls__sys_enter_open': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 19
libbpf: sec '.reltracepoint/syscalls/sys_enter_open': relo #3: insn #41 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_enter_open': found map 0 (start, sec 7, off 0) for insn #41
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': collecting relocation for section(3) 'tracepoint/syscalls/sys_enter_openat'
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #0: insn #3 against 'targ_tgid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 3
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #1: insn #11 against 'targ_pid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 11
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #2: insn #19 against 'targ_uid'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 19
libbpf: sec '.reltracepoint/syscalls/sys_enter_openat': relo #3: insn #41 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': found map 0 (start, sec 7, off 0) for insn #41
libbpf: sec '.reltracepoint/syscalls/sys_exit_open': collecting relocation for section(4) 'tracepoint/syscalls/sys_exit_open'
libbpf: sec '.reltracepoint/syscalls/sys_exit_open': relo #0: insn #43 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_open': found map 0 (start, sec 7, off 0) for insn #43
libbpf: sec '.reltracepoint/syscalls/sys_exit_open': relo #1: insn #49 against 'targ_failed'
libbpf: prog 'tracepoint__syscalls__sys_exit_open': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 49
libbpf: sec '.reltracepoint/syscalls/sys_exit_open': relo #2: insn #77 against 'events'
libbpf: prog 'tracepoint__syscalls__sys_exit_open': found map 1 (events, sec 7, off 32) for insn #77
libbpf: sec '.reltracepoint/syscalls/sys_exit_open': relo #3: insn #85 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_open': found map 0 (start, sec 7, off 0) for insn #85
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': collecting relocation for section(5) 'tracepoint/syscalls/sys_exit_openat'
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #0: insn #43 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 0 (start, sec 7, off 0) for insn #43
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #1: insn #49 against 'targ_failed'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found data map 2 (opensnoo.rodata, sec 6, off 0) for insn 49
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #2: insn #77 against 'events'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 1 (events, sec 7, off 32) for insn #77
libbpf: sec '.reltracepoint/syscalls/sys_exit_openat': relo #3: insn #85 against 'start'
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': found map 0 (start, sec 7, off 0) for insn #85
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
libbpf: map 'start': created successfully, fd=4
libbpf: map 'events': setting size to 4
libbpf: map 'events': created successfully, fd=5
libbpf: map 'opensnoo.rodata': created successfully, fd=6
libbpf: sec 'tracepoint/syscalls/sys_enter_open': found 2 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #0: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[0] (0:2:0 @ offset 16)
libbpf: CO-RE relocating [0] struct trace_event_raw_sys_enter: found target candidate [3857] struct trace_event_raw_sys_enter in [vmlinux]
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #0: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[0] (0:2:0 @ offset 16)
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #0: patched insn #33 (LDX/ST/STX) off 16 -> 16
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #1: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #1: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: prog 'tracepoint__syscalls__sys_enter_open': relo #1: patched insn #35 (LDX/ST/STX) off 24 -> 24
libbpf: sec 'tracepoint/syscalls/sys_enter_openat': found 2 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[1] (0:2:1 @ offset 24)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #0: patched insn #33 (LDX/ST/STX) off 24 -> 24
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: kind <byte_off> (0), spec is [23] struct trace_event_raw_sys_enter.args[2] (0:2:2 @ offset 32)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: matching candidate #0 [3857] struct trace_event_raw_sys_enter.args[2] (0:2:2 @ offset 32)
libbpf: prog 'tracepoint__syscalls__sys_enter_openat': relo #1: patched insn #35 (LDX/ST/STX) off 32 -> 32
libbpf: sec 'tracepoint/syscalls/sys_exit_open': found 1 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_exit_open': relo #0: kind <byte_off> (0), spec is [36] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: CO-RE relocating [0] struct trace_event_raw_sys_exit: found target candidate [3858] struct trace_event_raw_sys_exit in [vmlinux]
libbpf: prog 'tracepoint__syscalls__sys_exit_open': relo #0: matching candidate #0 [3858] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: prog 'tracepoint__syscalls__sys_exit_open': relo #0: patched insn #48 (LDX/ST/STX) off 16 -> 16
libbpf: sec 'tracepoint/syscalls/sys_exit_openat': found 1 CO-RE relocations
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: kind <byte_off> (0), spec is [36] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: matching candidate #0 [3858] struct trace_event_raw_sys_exit.ret (0:2 @ offset 16)
libbpf: prog 'tracepoint__syscalls__sys_exit_openat': relo #0: patched insn #48 (LDX/ST/STX) off 16 -> 16
libbpf: failed to open '/sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id': No such file or directory
libbpf: failed to determine tracepoint 'syscalls/sys_enter_open' perf event ID: No such file or directory
libbpf: prog 'tracepoint__syscalls__sys_enter_open': failed to create tracepoint 'syscalls/sys_enter_open' perf event: No such file or directory
libbpf: failed to auto-attach program 'tracepoint__syscalls__sys_enter_open': -2
failed to attach BPF programs

anakryiko commented 3 years ago

what kernel version are you running? I suspect it just doesn't support perf_event-based kprobe/tracepoint and needs a different way to attach to kprobe/tracepoint. Or your kernel is not compiled with tracepoint support?.. Don't know. Fallback I'm referring to is the legacy kprobe being implemented in that referenced patch.

I tried recently on 5.12-rc2 ish so that's just about as bleeding edge as it can get. It could be I'm missing something in .config though, but I'll need help figuring out what's missing as the most obvious ones look good to me (PERF_EVENTS, KPROBES, FTRACE, FUNCTION_TRACER, FTRACE_SYSCALLS, KPROBE_EVENTS... if anything I didn't have DYNAMIC_FTRACE but KPROBES_ON_FTRACE requires HAVE_KPROBES_ON_FTRACE which isn't set for arm64 so that one isn't available)

FWIW, here's snippets from my config:

$ rg TRACEPOINT ~/linux-build/default/.config
273:CONFIG_TRACEPOINTS=y
3459:# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
5461:CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
5503:# CONFIG_TRACEPOINT_BENCHMARK is not set
$ rg FTRACE ~/linux-build/default/.config
677:CONFIG_KPROBES_ON_FTRACE=y
687:CONFIG_HAVE_KPROBES_ON_FTRACE=y
5456:CONFIG_HAVE_DYNAMIC_FTRACE=y
5457:CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
5458:CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
5459:CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
5460:CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
5472:CONFIG_FTRACE=y
5476:CONFIG_DYNAMIC_FTRACE=y
5477:CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
5478:CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
5485:CONFIG_FTRACE_SYSCALLS=y
5498:CONFIG_FTRACE_MCOUNT_RECORD=y
5499:CONFIG_FTRACE_MCOUNT_USE_CC=y
5506:# CONFIG_FTRACE_RECORD_RECURSION is not set
5507:# CONFIG_FTRACE_STARTUP_TEST is not set

Is there some way to check if a fallback is used from a successful probe (e.g. openat; verbose mode output?) or test e.g. if perf trace works I'm good?

this fallback is not yet implemented in libbpf; the patch I referred to is still under development

As for just silently ignoring -ENOENT errors from skeleton, that's too fragile and error-prone.

Well, I think we need a way to flag some syscalls as optional so it's ok if they're missing -- I'm not sure why you're expecting the libbpf program to "know" that there is no open syscall on arm64.

Yes, I certainly expect application to know that something that it tries to attach to doesn't exist. Otherwise there will be a lot of confusion when libbpf silently and unexpectedly doesn't attach to desired BPF hook.

The expectation is that the user-space application will do the necessary feature probing to determine what the system at hand supports, and if some BPF programs can't be successfully loaded and/or attached, you can disable their loading and attaching with:

bpf_program__set_autoload(my_prog, false);

For example I'd expect the probe to fail to attach if I make a typo and request to attach on "sys_enter_opneat"; so why would the lack of tracepoint/syscalls/sys_enter_open be OK for arm64?

I'm not following. Who says that the lack of that tracepoint is OK?

martinetd commented 3 years ago

FWIW, here's snippets from my config:

Thanks, I think that part is good.

this fallback is not yet implemented in libbpf; the patch I referred to is still under development

Sorry, I'm confused now. Your first reply mentioned a fallback for kprobe to legacy kprobes that is in development, and another fallback for tracepoints that sounded like an independent, already there thing. I don't think that's the problem anyway, let's drop this unless you think it's relevant.

Yes, I certainly expect application to know that something that it tries to attach to doesn't exist. Otherwise there will be a lot of confusion when libbpf silently and unexpectedly doesn't attach to desired BPF hook.

I most definitely agree so far.

The expectation is that the user-space application will do the necessary feature probing to determine what the system at hand supports, and if some BPF programs can't be successfully loaded and/or attached, you can disable their loading and attaching with:

bpf_program__set_autoload(my_prog, false);

Ok, so, taking the easy way out of checking arch instead of sysfs files I could just do something like this?

diff --git a/libbpf-tools/opensnoop.c b/libbpf-tools/opensnoop.c
index 104783b5e47f..128c8342433d 100644
--- a/libbpf-tools/opensnoop.c
+++ b/libbpf-tools/opensnoop.c
@@ -242,6 +242,11 @@ int main(int argc, char **argv)
        obj->rodata->targ_uid = env.uid;
        obj->rodata->targ_failed = env.failed;

+#ifdef __aarch64__
+       bpf_program__set_autoload(obj->progs.tracepoint__syscalls__sys_enter_open, false);
+       bpf_program__set_autoload(obj->progs.tracepoint__syscalls__sys_exit_open, false);
+#endif
+
        err = opensnoop_bpf__load(obj);
        if (err) {
                fprintf(stderr, "failed to load BPF object: %d\n", err);

(can't test right now)

And what you're referring to as checking in /sys would be something like access("/sys/kernel/debug/tracing/events/syscalls/sys_enter_open", R_OK) as an explicit check and disable loading before trying to load like this if the probe is optional? That feels a bit cumbersome to expect everyone to know what to check (what files in /sys, or I guess kallsyms might work for most kprobes?) but most of the difficulties I'm thinking of (for example /sys/kernel/debug not mounted or insufficient permission) are just as difficult to handle in libbpf so I guess I can relate... In this case openat would fail as well if there is a more general problem so we might get away with just an access() call but in general it might actually be better to check by arch as I did here, I don't know.
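
A minimal sketch of the explicit check described above, assuming tracefs is reachable at one of the two usual mount points. `tracepoint_exists` is a hypothetical helper, not a libbpf API, and the mount points are assumptions about the system. A `false` result cannot distinguish a missing tracepoint from an unmounted tracefs or insufficient permissions, which is the ambiguity raised above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libbpf): returns true if the event
 * directory for <category>/<name> is visible under a known tracefs root.
 */
bool tracepoint_exists(const char *category, const char *name)
{
	/* Usual tracefs locations; either may be absent on a given system. */
	static const char *const roots[] = {
		"/sys/kernel/tracing",
		"/sys/kernel/debug/tracing",
	};
	char path[256];

	for (size_t i = 0; i < sizeof(roots) / sizeof(roots[0]); i++) {
		int n = snprintf(path, sizeof(path), "%s/events/%s/%s",
				 roots[i], category, name);

		/* Skip this root if the path did not fit in the buffer. */
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;

		/* F_OK only tests existence; it also fails on EACCES. */
		if (access(path, F_OK) == 0)
			return true;
	}
	return false;
}
```

A caller could use `!tracepoint_exists("syscalls", "sys_enter_open")` as the condition for the two `bpf_program__set_autoload(..., false)` calls from the diff above, in place of the `#ifdef __aarch64__` guard.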

anakryiko commented 3 years ago

this fallback is not yet implemented in libbpf; the patch I referred to is still under development

Sorry, I'm confused now. Your first reply mentioned a fallback for kprobe to legacy kprobes that is in development, and another fallback for tracepoints that sounded like an independent, already there thing.

Ok, sorry for the confusion. There is kprobe fallback in development. No tracepoint fallback is in development, but it should be very similar to kprobe.

I don't think that's the problem anyway, let's drop this unless you think it's relevant.

I'm still trying to understand why that tracepoint is missing. It seems strange that just one particular architecture doesn't have the syscalls/sys_enter_open tracepoint. If it has open() syscall, it should have that tracepoint.

So do you see something like below?

# sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format
name: sys_enter_open
ID: 583
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:int __syscall_nr; offset:8;       size:4; signed:1;
        field:const char * filename;    offset:16;      size:8; signed:0;
        field:int flags;        offset:24;      size:8; signed:0;
        field:umode_t mode;     offset:32;      size:8; signed:0;

print fmt: "filename: 0x%08lx, flags: 0x%08lx, mode: 0x%08lx", ((unsigned long)(REC->filename)), ((unsigned long)(REC->flags)), ((unsigned long)(REC->mode))

The expectation is that the user-space application will do the necessary feature probing to determine what the system at hand supports, and if some BPF programs can't be successfully loaded and/or attached, you can disable their loading and attaching with: bpf_program__set_autoload(my_prog, false);

Ok, so, taking the easy way out of checking arch instead of sysfs files I could just do something like this?

Yes you could, but that would also defy the point of opensnoop tool...

In this case openat would fail as well if there is a more general problem so we might get away with just an access() call but in general it might actually be better to check by arch as I did here, I don't know.

Yes, that's my point. That there might be some more general problem, so libbpf shouldn't just ignore problems with some of BPF programs. If user specified that they want to load some program, it has to load successfully.

If your point is that libbpf should allow specifying that some BPF programs are optional, then sure, as a feature it makes sense, as long as it's a conscious decision on user's part.

In this specific case, though, it seems like it's some general problem, so would be good to get to the bottom of it first.

martinetd commented 3 years ago

I'm still trying to understand why that tracepoint is missing. It seems strange that just one particular architecture doesn't have the syscalls/sys_enter_open tracepoint. If it has open() syscall, it should have that tracepoint.

So do you see something like below?

No, there is no open there:

# ls  /sys/kernel/debug/tracing/events/syscalls/ | grep enter_open
sys_enter_open_by_handle_at
sys_enter_open_tree
sys_enter_openat
sys_enter_openat2

Yes you could, but that would also defy the point of opensnoop tool...

I don't think so, from what I noticed it doesn't miss anything; the libc sends all kinds of opens through openat; I let bpftrace -e 'tracepoint:syscalls:sys_enter_open* { printf("caught %s\n", probe); }' run for a while and it never ran into openat2 either -- but I could trigger it by running syscall(__NR_openat2... Another test on kprobe:do_sys_open didn't catch anything either, and looking at various syscall.h and similar I couldn't find what to use (there's __NR_open defined as 5 in arch/arm64/include/asm/unistd32.h but 5 is setxattr on aarch64, perhaps it would be attainable in 32-bit emulation mode?)

Well, either way for my purposes just catching openat is good enough; I'm not trying to fake-seccomp something, just occasionally debug what normal programs try to open.

If your point is that libbpf should allow specifying that some BPF programs are optional, then sure, as a feature it makes sense, as long as it's a conscious decision on user's part.

Yes, that's what I had meant. Let's get to the end of the current issue first though as you say.

anakryiko commented 3 years ago

I'm still trying to understand why that tracepoint is missing. It seems strange that just one particular architecture doesn't have the syscalls/sys_enter_open tracepoint. If it has open() syscall, it should have that tracepoint. So do you see something like below?

No, there is no open there:

# ls  /sys/kernel/debug/tracing/events/syscalls/ | grep enter_open
sys_enter_open_by_handle_at
sys_enter_open_tree
sys_enter_openat
sys_enter_openat2

Yes you could, but that would also defy the point of opensnoop tool...

I don't think so, from what I noticed it doesn't miss anything; the libc sends all kinds of opens through openat; I let bpftrace -e 'tracepoint:syscalls:sys_enter_open* { printf("caught %s\n", probe); }' run for a while and it never ran into openat2 either -- but I could trigger it by running syscall(__NR_openat2... Another test on kprobe:do_sys_open didn't catch anything either, and looking at various syscall.h and similar I couldn't find what to use (there's __NR_open defined as 5 in arch/arm64/include/asm/unistd32.h but 5 is setxattr on aarch64, perhaps it would be attainable in 32-bit emulation mode?)

So I think aarch64 never had __NR_open defined (see also [0]), so it makes sense that there is no such tracepoint. In that sense, yes, it's totally ok to not trace sys_enter_open at all. By "defy the point" I meant that if open() syscall exists and it's not traced, then such tool is clearly missing something. But if there is no open() at all, then it's totally reasonable to just skip attaching to this tracepoint.

[0] https://stackoverflow.com/questions/55403236/why-is-the-open-syscall-supported-on-some-linux-systems-and-not-others
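
The point about `__NR_open` can also be checked at build time. The sketch below is an illustration only, and both function names are made up for it. It reports whether the headers the tool is compiled against define `__NR_open` and `__NR_openat`. It says nothing about the running kernel or about 32-bit compat syscalls, so it is closer to the `#ifdef __aarch64__` approach than to a runtime probe.

```c
#include <stdbool.h>
#include <sys/syscall.h> /* pulls in the __NR_* numbers for the build target */

/* True if the build target defines a plain open() syscall number. */
bool have_open_syscall(void)
{
#ifdef __NR_open
	return true;
#else
	return false;
#endif
}

/* True if the build target defines the openat() syscall number. */
bool have_openat_syscall(void)
{
#ifdef __NR_openat
	return true;
#else
	return false;
#endif
}
```

Keying the autoload decision on `__NR_open` instead of `__aarch64__` would, under the same assumption, also cover other architectures that lack a plain open() syscall.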

Well, either way for my purposes just catching openat is good enough; I'm not trying to fake-seccomp something, just occasionally debug what normal programs try to open.

If your point is that libbpf should allow specifying that some BPF programs are optional, then sure, as a feature it makes sense, as long as it's a conscious decision on user's part.

Yes, that's what I had meant. Let's get to the end of the current issue first though as you say.

Right, so I think we figured it out. On aarch64 there is no need to attach to sys_enter_open. I'd suggest just explicitly disabling sys_enter_open and sys_exit_open, just like you did in https://github.com/iovisor/bcc/issues/3344#issuecomment-812337102. Do you mind submitting a PR? Please also add a comment in the code explaining why we do this on aarch64.