unknown function "bpf_seq_write"

Jianchen-Shan commented 1 year ago

Discussed in https://github.com/iovisor/bpftrace/discussions/2786

^{Originally posted by **Jianchen-Shan** October 4, 2023} Hello I am new to bpftrace and I am encountering an issue when using it after I customized my kernel with a patch found here: https://lore.kernel.org/bpf/CA+khW7i460ey-UFzpMSJ8AP9QeD8ufa4FzLA4PQckNP00ShQSw@mail.gmail.com/T/#m14c6a514541bbaad87768cdbcded6b7b8778fa2d As you can see from the output below, the examples show that one tool (e.g., vfsstat) can run correctly under the bcc directory, but they would fail under bpftrace/tools with an error complaining about an unknown function. Can anyone give me some insights? I don't think the customized kernel is an issue since the BCC tools work well. I built bpftrace from the source as shown in the installation instructions on the ubuntu machine. I appreciate your help. ubuntu@myvm:~/bpftrace/bcc/tools$ sudo ./vfsstat.py TIME READ/s WRITE/s FSYNC/s OPEN/s CREATE/s 00:15:57: 1330 3 0 19 0 00:15:58: 34 3 0 16 0 00:15:59: 70 3 0 19 0 ubuntu@myvm:~/bpftrace/bcc/libbpf-tools$ sudo ./vfsstat TIME READ/s WRITE/s FSYNC/s OPEN/s CREATE/s 00:16:47: 40 5 0 17 0 00:16:48: 71 4 0 19 0 ubuntu@myvm:~/bpftrace/tools$ sudo ./vfsstat.bt -v INFO: node count: 31 Attaching 8 probes... Error log: 0: R1=ctx(off=0,imm=0) R10=fp0 0: (b7) r6 = 0 ; R6_w=0 1: (7b) *(u64 *)(r10 -16) = r6 ; R6_w=P0 R10=fp0 fp-16_w=00000000 2: (18) r1 = 0xffff93d8ef573a00 ; R1_w=map_ptr(off=0,ks=0,vs=0,imm=0) 4: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 5: (07) r2 += -16 ; R2_w=fp-16 6: (b7) r3 = 8 ; R3_w=8 7: (b7) r4 = 0 ; R4_w=0 8: (85) call bpf_seq_write#130 unknown func bpf_seq_write#130 processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 ERROR: Error loading program: BEGIN

danobi commented 1 year ago

Hi, please supply the output of bpftrace --info. And the contents of vfsstat.bt you're running. Calls to bpf_seq_write are quite odd -- I don't think the vfsstat.bt we have checked into the tree uses that

Jianchen-Shan commented 1 year ago

Hi danobi,

I appreciate your response. I use vfsstat.bt as an exmaple, the same issue happens on other tools. For example,

ubuntu@myvm:~/bpftrace/tools$ sudo ./runqlen.bt -v
INFO: node count: 41
Attaching 2 probes...

Error log: 
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r6 = 0                        ; R6_w=0
1: (7b) *(u64 *)(r10 -16) = r6        ; R6_w=P0 R10=fp0 fp-16_w=00000000
2: (18) r1 = 0xffff93d8f191dc00       ; R1_w=map_ptr(off=0,ks=0,vs=0,imm=0)
4: (bf) r2 = r10                      ; R2_w=fp0 R10=fp0
5: (07) r2 += -16                     ; R2_w=fp-16
6: (b7) r3 = 8                        ; R3_w=8
7: (b7) r4 = 0                        ; R4_w=0
8: (85) call bpf_seq_write#130
unknown func bpf_seq_write#130
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

ERROR: Error loading program: BEGIN

Here is the contents of vfsstat.bt


/*
 * vfsstat      Count some VFS calls, with per-second summaries.
 *              For Linux, uses bpftrace and eBPF.
 *
 * Written as a basic example of counting multiple events and printing a
 * per-second summary.
 *
 * USAGE: vfsstat.bt
 *
 * This is a bpftrace version of the bcc tool of the same name.
 *
 * Copyright 2018 Netflix, Inc.
 * Licensed under the Apache License, Version 2.0 (the "License")
 *
 * 06-Sep-2018  Brendan Gregg   Created this.
 */

BEGIN
{
        printf("Tracing key VFS calls... Hit Ctrl-C to end.\n");

}

kprobe:vfs_read*,
kprobe:vfs_write*,
kprobe:vfs_fsync,
kprobe:vfs_open,
kprobe:vfs_create
{
        @[func] = count();
}

interval:s:1
{
        time();
        print(@);
        clear(@);
}

END
{
        clear(@);
}

Here is an output of the bpftrace --info

ubuntu@myvm:~/bpftrace/build$ sudo bpftrace --info
System
  OS: Linux 6.1.36BPF #15 SMP PREEMPT_DYNAMIC Sat Sep 30 23:16:32 EDT 2023
  Arch: x86_64

Build
  version: v0.19.0-9-gea56a
  LLVM: 14.0.0
  unsafe probe: no
  bfd: yes
  libdw (DWARF support): yes

Kernel helpers
  probe_read: yes
  probe_read_str: yes
  probe_read_user: yes
  probe_read_user_str: yes
  probe_read_kernel: yes
  probe_read_kernel_str: yes
  get_current_cgroup_id: yes
  send_signal: yes
  override_return: yes
  get_boot_ns: no
  dpath: no
  skboutput: yes
  get_tai_ns: no
  get_func_ip: yes
  jiffies64: yes

Kernel features
  Instruction limit: 1000000
  Loop support: yes
  btf: yes
  module btf: yes
  map batch: yes
  uprobe refcount (depends on Build:bcc bpf_attach_uprobe refcount): yes

Map types
  hash: yes
  percpu hash: yes
  array: yes
  percpu array: yes
  stack_trace: yes
  perf_event_array: yes
  ringbuf: yes

Probe types
  kprobe: yes
  tracepoint: yes
  perf_event: yes
  kfunc: yes
  kprobe_multi: yes
  raw_tp_special: yes
  iter: yes

Jianchen-Shan commented 1 year ago

Basically, the error happens when any tool tries to print (from the bpf program to the console, i.e., user space) in my patched kernel. Therefore, I tried to implement this by myself in c with the ring buff. But I found the same issue happens with the following test:

ubuntu@myvm:~/bpftrace/bcc/examples/ringbuf$ sudo ./ringbuf_output.py 
bpf: Failed to load program: Invalid argument
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bf) r6 = r1                       ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (b7) r1 = 0                        ; R1_w=0
2: (63) *(u32 *)(r10 -8) = r1         ; R1_w=P0 R10=fp0 fp-8=????0000
3: (7b) *(u64 *)(r10 -16) = r1        ; R1_w=P0 R10=fp0 fp-16_w=00000000
4: (7b) *(u64 *)(r10 -24) = r1        ; R1_w=P0 R10=fp0 fp-24_w=00000000
5: (7b) *(u64 *)(r10 -32) = r1        ; R1_w=P0 R10=fp0 fp-32_w=00000000
6: (79) r3 = *(u64 *)(r6 +24)         ; R3_w=scalar() R6_w=ctx(off=0,imm=0)
7: (bf) r7 = r10                      ; R7_w=fp0 R10=fp0
8: (07) r7 += -32                     ; R7_w=fp-32
9: (bf) r1 = r7                       ; R1_w=fp-32 R7_w=fp-32
10: (b7) r2 = 16                      ; R2_w=16
11: (85) call bpf_probe_read_user_str#114     ; R0_w=scalar(smin=-4095,smax=16) fp-24_w=mmmmmmmm fp-32_w=mmmmmmmm
12: (79) r1 = *(u64 *)(r6 +16)        ; R1_w=scalar() R6_w=ctx(off=0,imm=0)
13: (63) *(u32 *)(r10 -16) = r1       ; R1_w=scalar() R10=fp0 fp-16_w=0000mmmm
14: (79) r1 = *(u64 *)(r6 +32)        ; R1_w=scalar() R6_w=ctx(off=0,imm=0)
15: (63) *(u32 *)(r10 -12) = r1       ; R1_w=scalar() R10=fp0 fp-16_w=mmmmmmmm
16: (69) r1 = *(u16 *)(r6 +40)        ; R1_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R6_w=ctx(off=0,imm=0)
17: (63) *(u32 *)(r10 -8) = r1        ; R1_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R10=fp0 fp-8=
18: (18) r1 = 0xffff93d8f25a9000      ; R1_w=map_ptr(off=0,ks=0,vs=0,imm=0)
20: (bf) r2 = r7                      ; R2_w=fp-32 R7_w=fp-32
21: (b7) r3 = 28                      ; R3_w=28
22: (b7) r4 = 0                       ; R4_w=0
23: (85) call bpf_seq_write#130
unknown func bpf_seq_write#130
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

Traceback (most recent call last):
  File "/home/ubuntu/bpftrace/bcc/examples/ringbuf/./ringbuf_output.py", line 35, in <module>
    b = BPF(text=src)
  File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 487, in __init__
  File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 1475, in _trace_autoload
  File "/usr/lib/python3/dist-packages/bcc-0.28.0+66cb619e-py3.10.egg/bcc/__init__.py", line 526, in load_func
Exception: Failed to load BPF program b'tracepoint__syscalls__sys_enter_openat': Invalid argument

I am not sure how my changes in the kernel caused this issue. So I am also attaching my kernel source tree here: https://github.com/Jianchen-Shan/linux-6.1.36.git (I added a new bpf prog type "sched" and defined multiple hooks in this type.) Any insights are appreciated.

danobi commented 1 year ago

Indeed weird. I'll try to take a closer look this week. In meantime, does the issue reproduce on your custom kernel without your patches? If not, then it's the changes you're making right?

Jianchen-Shan commented 1 year ago

Yes. All bpftrace tools work fine without my kernel patch. But the libbpf tools work fine even with the patch being applied. The patch somehow hurts bpftrace tools more than it does to libbpf tools.

That's why I am really curious about this issue and posted it here to understand what is going on. I guess the other question is how developers can safely patch their kernel, especially when they are making changes to bpf-related files, without breaking tools like bpftrace as I did here.

xh4n3 commented 1 year ago

IIUC, bpftrace is using 130 as the func_id for bpf_ringbuf_output, while your kernel think 130 should be the func_id for bpf_seq_write. Since the latter is only available for certain program types, the verifier rejects the function call.

So I guess the root cause is the mismatched func_ids in your kernel source and in libbpf code for compiling bpftrace, i.e. the extra three sched_entity_* functions inserted in the include/uapi/linux/bpf.h.

You can try to move these three sched_entity_* functions to the end of the list, and recompile the kernel.

Links:

bpf_seq_write is 130th function in your kernel source https://github.com/Jianchen-Shan/linux-6.1.36/blob/main/include/uapi/linux/bpf.h#L5592C8-L5592C8
bpf_ringbuf_output's func_id is 130 in the libbpf repo https://github.com/libbpf/libbpf/blob/master/src/bpf_helper_defs.h#L3159

Jianchen-Shan commented 1 year ago

Hi @xh4n3,

Thanks for your explanation. The proposed solution fixed the issue after recompiling the kernel with the new order.

The changes I made in my custom kernel can be seen here: https://github.com/Jianchen-Shan/linux-6.1.36/commit/bc33f3828f9b269f24ab92c955f6bc4fd7349873

The key takeaway for those who read this in the future: always add the new bpf helper functions toward the end to avoid breaking the tools in bcc/bpftrace.

To @danobi, thanks for your help, too. I think this issue can be closed now.

bpftrace / bpftrace

unknown function "bpf_seq_write" #2787

Discussed in https://github.com/iovisor/bpftrace/discussions/2786