Closed Jianchen-Shan closed 1 year ago
Hi, please supply the output of bpftrace --info
. And the contents of vfsstat.bt
you're running. Calls to bpf_seq_write
are quite odd -- I don't think the vfsstat.bt we have checked into the tree uses that
Hi danobi,
I appreciate your response. I use vfsstat.bt as an exmaple, the same issue happens on other tools. For example,
ubuntu@myvm:~/bpftrace/tools$ sudo ./runqlen.bt -v
INFO: node count: 41
Attaching 2 probes...
Error log:
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r6 = 0 ; R6_w=0
1: (7b) *(u64 *)(r10 -16) = r6 ; R6_w=P0 R10=fp0 fp-16_w=00000000
2: (18) r1 = 0xffff93d8f191dc00 ; R1_w=map_ptr(off=0,ks=0,vs=0,imm=0)
4: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
5: (07) r2 += -16 ; R2_w=fp-16
6: (b7) r3 = 8 ; R3_w=8
7: (b7) r4 = 0 ; R4_w=0
8: (85) call bpf_seq_write#130
unknown func bpf_seq_write#130
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
ERROR: Error loading program: BEGIN
Here is the contents of vfsstat.bt
/*
* vfsstat Count some VFS calls, with per-second summaries.
* For Linux, uses bpftrace and eBPF.
*
* Written as a basic example of counting multiple events and printing a
* per-second summary.
*
* USAGE: vfsstat.bt
*
* This is a bpftrace version of the bcc tool of the same name.
*
* Copyright 2018 Netflix, Inc.
* Licensed under the Apache License, Version 2.0 (the "License")
*
* 06-Sep-2018 Brendan Gregg Created this.
*/
BEGIN
{
printf("Tracing key VFS calls... Hit Ctrl-C to end.\n");
}
kprobe:vfs_read*,
kprobe:vfs_write*,
kprobe:vfs_fsync,
kprobe:vfs_open,
kprobe:vfs_create
{
@[func] = count();
}
interval:s:1
{
time();
print(@);
clear(@);
}
END
{
clear(@);
}
Here is an output of the bpftrace --info
ubuntu@myvm:~/bpftrace/build$ sudo bpftrace --info
System
OS: Linux 6.1.36BPF #15 SMP PREEMPT_DYNAMIC Sat Sep 30 23:16:32 EDT 2023
Arch: x86_64
Build
version: v0.19.0-9-gea56a
LLVM: 14.0.0
unsafe probe: no
bfd: yes
libdw (DWARF support): yes
Kernel helpers
probe_read: yes
probe_read_str: yes
probe_read_user: yes
probe_read_user_str: yes
probe_read_kernel: yes
probe_read_kernel_str: yes
get_current_cgroup_id: yes
send_signal: yes
override_return: yes
get_boot_ns: no
dpath: no
skboutput: yes
get_tai_ns: no
get_func_ip: yes
jiffies64: yes
Kernel features
Instruction limit: 1000000
Loop support: yes
btf: yes
module btf: yes
map batch: yes
uprobe refcount (depends on Build:bcc bpf_attach_uprobe refcount): yes
Map types
hash: yes
percpu hash: yes
array: yes
percpu array: yes
stack_trace: yes
perf_event_array: yes
ringbuf: yes
Probe types
kprobe: yes
tracepoint: yes
perf_event: yes
kfunc: yes
kprobe_multi: yes
raw_tp_special: yes
iter: yes
Basically, the error happens when any tool tries to print (from the bpf program to the console, i.e., user space) in my patched kernel. Therefore, I tried to implement this by myself in c with the ring buff. But I found the same issue happens with the following test:
ubuntu@myvm:~/bpftrace/bcc/examples/ringbuf$ sudo ./ringbuf_output.py
bpf: Failed to load program: Invalid argument
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bf) r6 = r1 ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (b7) r1 = 0 ; R1_w=0
2: (63) *(u32 *)(r10 -8) = r1 ; R1_w=P0 R10=fp0 fp-8=????0000
3: (7b) *(u64 *)(r10 -16) = r1 ; R1_w=P0 R10=fp0 fp-16_w=00000000
4: (7b) *(u64 *)(r10 -24) = r1 ; R1_w=P0 R10=fp0 fp-24_w=00000000
5: (7b) *(u64 *)(r10 -32) = r1 ; R1_w=P0 R10=fp0 fp-32_w=00000000
6: (79) r3 = *(u64 *)(r6 +24) ; R3_w=scalar() R6_w=ctx(off=0,imm=0)
7: (bf) r7 = r10 ; R7_w=fp0 R10=fp0
8: (07) r7 += -32 ; R7_w=fp-32
9: (bf) r1 = r7 ; R1_w=fp-32 R7_w=fp-32
10: (b7) r2 = 16 ; R2_w=16
11: (85) call bpf_probe_read_user_str#114 ; R0_w=scalar(smin=-4095,smax=16) fp-24_w=mmmmmmmm fp-32_w=mmmmmmmm
12: (79) r1 = *(u64 *)(r6 +16) ; R1_w=scalar() R6_w=ctx(off=0,imm=0)
13: (63) *(u32 *)(r10 -16) = r1 ; R1_w=scalar() R10=fp0 fp-16_w=0000mmmm
14: (79) r1 = *(u64 *)(r6 +32) ; R1_w=scalar() R6_w=ctx(off=0,imm=0)
15: (63) *(u32 *)(r10 -12) = r1 ; R1_w=scalar() R10=fp0 fp-16_w=mmmmmmmm
16: (69) r1 = *(u16 *)(r6 +40) ; R1_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R6_w=ctx(off=0,imm=0)
17: (63) *(u32 *)(r10 -8) = r1 ; R1_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R10=fp0 fp-8=
18: (18) r1 = 0xffff93d8f25a9000 ; R1_w=map_ptr(off=0,ks=0,vs=0,imm=0)
20: (bf) r2 = r7 ; R2_w=fp-32 R7_w=fp-32
21: (b7) r3 = 28 ; R3_w=28
22: (b7) r4 = 0 ; R4_w=0
23: (85) call bpf_seq_write#130
unknown func bpf_seq_write#130
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Traceback (most recent call last):
File "/home/ubuntu/bpftrace/bcc/examples/ringbuf/./ringbuf_output.py", line 35, in <module>
b = BPF(text=src)
File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 487, in __init__
File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 1475, in _trace_autoload
File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 526, in load_func
Exception: Failed to load BPF program b'tracepoint__syscalls__sys_enter_openat': Invalid argument
I am not sure how my changes in the kernel caused this issue. So I am also attaching my kernel source tree here: https://github.com/Jianchen-Shan/linux-6.1.36.git (I added a new bpf prog type "sched" and defined multiple hooks in this type.) Any insights are appreciated.
Indeed weird. I'll try to take a closer look this week. In meantime, does the issue reproduce on your custom kernel without your patches? If not, then it's the changes you're making right?
Yes. All bpftrace tools work fine without my kernel patch. But the libbpf tools work fine even with the patch being applied. The patch somehow hurts bpftrace tools more than it does to libbpf tools.
That's why I am really curious about this issue and posted it here to understand what is going on. I guess the other question is how developers can safely patch their kernel, especially when they are making changes to bpf-related files, without breaking tools like bpftrace as I did here.
IIUC, bpftrace is using 130 as the func_id for bpf_ringbuf_output
, while your kernel think 130 should be the func_id for bpf_seq_write
. Since the latter is only available for certain program types, the verifier rejects the function call.
So I guess the root cause is the mismatched func_ids in your kernel source and in libbpf code for compiling bpftrace, i.e. the extra three sched_entity_*
functions inserted in the include/uapi/linux/bpf.h
.
You can try to move these three sched_entity_*
functions to the end of the list, and recompile the kernel.
Links:
Hi @xh4n3,
Thanks for your explanation. The proposed solution fixed the issue after recompiling the kernel with the new order.
The changes I made in my custom kernel can be seen here: https://github.com/Jianchen-Shan/linux-6.1.36/commit/bc33f3828f9b269f24ab92c955f6bc4fd7349873
The key takeaway for those who read this in the future: always add the new bpf helper functions toward the end to avoid breaking the tools in bcc/bpftrace.
To @danobi, thanks for your help, too. I think this issue can be closed now.
Discussed in https://github.com/iovisor/bpftrace/discussions/2786