foss-for-synopsys-dwc-arc-processors / linux

Helpful resources for users & developers of Linux kernel for ARC

eBPF: Freezes on bootstrap example #139

Closed kolerov closed 5 months ago

kolerov commented 1 year ago

I used the guides in the documentation and the arc-bpf-testbench repository to build Linux with eBPF support. I also built the bootstrap example from the same repository and tried to run it in Linux on QEMU. Then I connected to Linux over SSH from another terminal to trigger eBPF events, but QEMU froze and I saw this in the main terminal:

# ./bootstrap
TIME     EVENT COMM             PID     PPID    FILENAME/EXIT CODE

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        (detected by 0, t=2102 jiffies, g=3001, q=3)
rcu: All QSes seen, last rcu_preempt kthread activity 2102 (139188-137086), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_preempt kthread starved for 2102 jiffies! g3001 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:    0 pid:   13 ppid:     2 flags:0x00000000

Stack Trace:
rcu: Stack dump where RCU GP kthread last ran:
Task dump for CPU 0:
task:sshd            state:R  running task     stack:    0 pid:  162 ppid:   106 flags:0x00000008

Stack Trace:
  arc_unwind_core+0xe8/0x110
  rcu_check_gp_kthread_starvation+0xb8/0xdc
  rcu_sched_clock_irq+0x960/0xc28
  update_process_times+0x80/0xa8
  tick_sched_timer+0x40/0x9c
  __hrtimer_run_queues.constprop.0+0x1c8/0x2f0
  hrtimer_interrupt+0x102/0x2e8
  timer_irq_handler+0x18/0x20
  __handle_irq_event_percpu+0x6e/0x1a0
  handle_irq_event_percpu+0xc/0x40
  handle_percpu_irq+0x2e/0x4c
  generic_handle_domain_irq+0x32/0x74
  arch_do_IRQ+0x28/0x40
  ret_from_exception+0x0/0x8

I'm going to try the same scenario on HSDK.

shahab-vahedi commented 1 year ago

Trying it on HSDK would be the next logical step. Nevertheless, could you also dump the JITed code here? You have to enable the debugging mode in the JIT:

$ sysctl net.core.bpf_jit_enable=2

And then continue with the bootstrap example.

EDIT: bootstrap.txt

kolerov commented 1 year ago

The problem is in this line:

BPF_CORE_READ(task, real_parent, tgid);

Note that without JITing, the bootstrap example works fine, but bpf_probe_read_kernel_str doesn't work properly: the filename is not read from kernel space and is empty in the output. The same issue happens with the JIT too (if the BPF_CORE_READ line is deleted).

shahab-vahedi commented 1 year ago

Could you dump the VM/JIT bytes here with your reduced example?

kolerov commented 1 year ago

Here is the dump for a reduced example. The reduced example, bootstrap_handle_exec.bpf.c, can be found in a separate branch. This reduced version of bootstrap also leads to an "Oops".

# ./bootstrap_handle_exec
-----------------[  VM   ]-----------------
0x85, 0x00, 0x00, 0x00, 0xe4, 0x06, 0x02, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0xfc, 0x1c, 0xc8, 0xb7, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xfa, 0x80, 0x9c, 0xda, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0x04, 0x14, 0x1f, 0x34, 0x0a, 0x20, 0x00, 0x02
0x0a, 0x21, 0x40, 0x02, 0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x0a, 0x20, 0x00, 0x02, 0x0a, 0x21, 0x40, 0x02
0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x0a, 0x20, 0x00, 0x02, 0x0a, 0x21, 0x40, 0x02
0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0x18, 0x01, 0x00, 0x00, 0x20, 0xdf, 0x67, 0x82
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x7a, 0x01, 0x00, 0x00, 0x2a, 0x00, 0x00, 0x00
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0x0a, 0x20, 0x80, 0x0f, 0x67, 0x82, 0x20, 0xdf
0x0a, 0x21, 0x80, 0x0f, 0x00, 0x00, 0x00, 0x00
0x8a, 0x23, 0x80, 0x1a, 0x00, 0x18, 0xc0, 0x02
0x8a, 0x23, 0x00, 0x10, 0x04, 0x18, 0xc0, 0x02
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x0a, 0x20, 0x00, 0x02, 0x0a, 0x21, 0x40, 0x02
0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0xbf, 0xa1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x07, 0x01, 0x00, 0x00, 0xf8, 0xff, 0xff, 0xff
0xb7, 0x02, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00
0xb7, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xe8, 0x6a, 0xff, 0xff
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0xfc, 0x1c, 0xc8, 0xb7, 0x0a, 0x20, 0xc0, 0x06
0x8a, 0x21, 0x00, 0x00, 0x8a, 0x22, 0x3f, 0x1e
0x8a, 0x23, 0xff, 0x1f, 0x00, 0x20, 0x80, 0x82
0x01, 0x21, 0xc1, 0x02, 0x8a, 0x22, 0x00, 0x02
0x8a, 0x23, 0x00, 0x00, 0x8a, 0x24, 0x00, 0x00
0x8a, 0x25, 0x00, 0x00, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xf8, 0x80, 0xa0, 0x3e, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0x04, 0x14, 0x1f, 0x34, 0x0a, 0x20, 0x00, 0x02
0x0a, 0x21, 0x40, 0x02, 0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0x18, 0x01, 0x00, 0x00, 0x00, 0x7e, 0x6e, 0x82
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0xb7, 0x02, 0x00, 0x00, 0xa8, 0x00, 0x00, 0x00
0xb7, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xd8, 0xa1, 0x02, 0x00
0xbf, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x15, 0x06, 0x18, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xd4, 0x65, 0xff, 0xff
0xbf, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xa4, 0x06, 0x02, 0x00
0x77, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00
0x63, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0xb7, 0x01, 0x00, 0x00, 0x54, 0x02, 0x00, 0x00
0x0f, 0x17, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0xbf, 0xa1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x07, 0x01, 0x00, 0x00, 0xf0, 0xff, 0xff, 0xff
0xb7, 0x02, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00
0xbf, 0x73, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xe8, 0x6a, 0xff, 0xff
0xb7, 0x01, 0x00, 0x00, 0x50, 0x02, 0x00, 0x00
0x79, 0xa3, 0xf0, 0xff, 0x00, 0x00, 0x00, 0x00
0x0f, 0x13, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0xbf, 0xa1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x07, 0x01, 0x00, 0x00, 0xfc, 0xff, 0xff, 0xff
0xb7, 0x02, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0xe8, 0x6a, 0xff, 0xff
0x61, 0xa1, 0xfc, 0xff, 0x00, 0x00, 0x00, 0x00
0x63, 0x16, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00
0xbf, 0x61, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0xb7, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x85, 0x00, 0x00, 0x00, 0x64, 0xa4, 0x02, 0x00
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0xfc, 0x1c, 0xc8, 0xb7, 0xfc, 0x1c, 0x88, 0xb3
0xfc, 0x1c, 0xc8, 0xb3, 0xfc, 0x1c, 0x08, 0xb4
0xfc, 0x1c, 0x48, 0xb4, 0x0a, 0x20, 0x80, 0x0f
0x6e, 0x82, 0x00, 0x7e, 0x0a, 0x21, 0x80, 0x0f
0x00, 0x00, 0x00, 0x00, 0x8a, 0x22, 0x02, 0x0a
0x8a, 0x23, 0x00, 0x00, 0x8a, 0x24, 0x00, 0x00
0x8a, 0x25, 0x00, 0x00, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xfb, 0x80, 0x90, 0x75, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0x0a, 0x26, 0x00, 0x12, 0x0a, 0x27, 0x40, 0x12
0x8a, 0x22, 0x00, 0x10, 0x8a, 0x23, 0x00, 0x10
0xcc, 0x27, 0xc0, 0x92, 0xcc, 0x26, 0x81, 0x92
0x58, 0x01, 0x01, 0x00, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xf8, 0x80, 0x8c, 0x39, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0x0a, 0x20, 0x00, 0x22, 0x0a, 0x21, 0x40, 0x22
0xfc, 0x1c, 0xc8, 0xb5, 0xfc, 0x1c, 0x88, 0xb5
0x0a, 0x22, 0x80, 0x1f, 0xfa, 0x80, 0x5c, 0xda
0x22, 0x20, 0x80, 0x02, 0x00, 0x24, 0x9c, 0x3f
0x00, 0x00, 0x08, 0x00, 0x0a, 0x20, 0x00, 0x10
0x0a, 0x21, 0x40, 0x10, 0x41, 0x29, 0x08, 0x10
0x8a, 0x21, 0x00, 0x10, 0x00, 0x1e, 0x00, 0x12
0x8a, 0x20, 0x09, 0x05, 0x8a, 0x21, 0x00, 0x00
0x00, 0x20, 0x10, 0xa0, 0x01, 0x21, 0x51, 0x20
0x0a, 0x20, 0xc0, 0x06, 0x8a, 0x21, 0x00, 0x00
0x8a, 0x22, 0x3f, 0x1c, 0x8a, 0x23, 0xff, 0x1f
0x00, 0x20, 0x80, 0x82, 0x01, 0x21, 0xc1, 0x02
0x8a, 0x22, 0x00, 0x02, 0x8a, 0x23, 0x00, 0x00
0x0a, 0x24, 0x00, 0x04, 0x0a, 0x25, 0x40, 0x04
0xfc, 0x1c, 0xc8, 0xb5, 0xfc, 0x1c, 0x88, 0xb5
0x0a, 0x22, 0x80, 0x1f, 0xf8, 0x80, 0xa0, 0x3e
0x22, 0x20, 0x80, 0x02, 0x00, 0x24, 0x9c, 0x3f
0x00, 0x00, 0x08, 0x00, 0x0a, 0x20, 0x00, 0x10
0x0a, 0x21, 0x40, 0x10, 0x8a, 0x20, 0x09, 0x04
0x8a, 0x21, 0x00, 0x00, 0xf0, 0x13, 0x04, 0xb0
0xf4, 0x13, 0x05, 0xb0, 0x00, 0x24, 0x04, 0x80
0x01, 0x25, 0x45, 0x00, 0x0a, 0x20, 0xc0, 0x06
0x8a, 0x21, 0x00, 0x00, 0x8a, 0x22, 0x3f, 0x1f
0x8a, 0x23, 0xff, 0x1f, 0x00, 0x20, 0x80, 0x82
0x01, 0x21, 0xc1, 0x02, 0x8a, 0x22, 0x00, 0x01
0x8a, 0x23, 0x00, 0x00, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xf8, 0x80, 0xa0, 0x3e, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0xfc, 0x13, 0x00, 0xb0, 0x8a, 0x21, 0x00, 0x00
0x04, 0x1e, 0x00, 0x10, 0x0a, 0x20, 0x80, 0x03
0x0a, 0x21, 0xc0, 0x03, 0x8a, 0x22, 0x00, 0x00
0x8a, 0x23, 0x00, 0x00, 0xfc, 0x1c, 0xc8, 0xb5
0xfc, 0x1c, 0x88, 0xb5, 0x0a, 0x22, 0x80, 0x1f
0xfb, 0x80, 0x1c, 0x78, 0x22, 0x20, 0x80, 0x02
0x00, 0x24, 0x9c, 0x3f, 0x00, 0x00, 0x08, 0x00
0x0a, 0x20, 0x00, 0x10, 0x0a, 0x21, 0x40, 0x10
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x04, 0x14, 0x11, 0x34, 0x04, 0x14, 0x10, 0x34
0x04, 0x14, 0x0f, 0x34, 0x04, 0x14, 0x0e, 0x34
0x04, 0x14, 0x1f, 0x34, 0x0a, 0x20, 0x00, 0x02
0x0a, 0x21, 0x40, 0x02, 0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x0a, 0x20, 0x00, 0x02, 0x0a, 0x21, 0x40, 0x02
0xe0, 0x20, 0xc0, 0x07

-----------------[  VM   ]-----------------
0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-----------------[ jited ]-----------------
0x8a, 0x20, 0x00, 0x10, 0x8a, 0x21, 0x00, 0x10
0x0a, 0x20, 0x00, 0x02, 0x0a, 0x21, 0x40, 0x02
0xe0, 0x20, 0xc0, 0x07

TIME     EVENT PID     PPID

Oops
Path: /usr/sbin/sshd
CPU: 0 PID: 213 Comm: sshd Not tainted 5.18.0-rc4-740838-gd4c699142987 #3
Invalid Write @ 0x000000c5 by insn @ copy_from_kernel_nofault+0x4a/0xe0
ECR: 0x00050200 EFA: 0x000000c5 ERET: 0x80fe307a
STAT: 0xc0080a02 [IE K     ]   BTA: 0x80fe3046
 SP: 0x8247be48  FP: 0x000000d5 BLK: copy_from_kernel_nofault+0x16/0xe0
LPS: 0x8115731c LPE: 0x81157320 LPC: 0x00000000
r00: 0x821b5000 r01: 0x821b5000 r02: 0x00000001
r03: 0x00000000 r04: 0x823d6c54 r05: 0xffffffff
r06: 0x816ea4bc r07: 0x82159138 r08: 0x000000d5
r09: 0x00000000 r10: 0x80f83ea0 r11: 0xffffffff
r12: 0x00000000 r13: 0x80ef464c r14: 0x400c9348
r15: 0x400e4700 r16: 0x400e488c r17: 0x400e4898
r18: 0x00000000 r19: 0x400e4898 r20: 0x00000000
r21: 0x00000000 r22: 0x00000000 r23: 0x00000000
r24: 0x00000000 r25: 0x00000000

Stack Trace:
  copy_from_kernel_nofault+0x4a/0xe0
  bpf_probe_read_kernel+0x12/0x3c
  bpf_prog_0fe2a87411794d37_handle_exec+0x11c/0x1f0
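
For reading the [ VM ] sections above: each row is one 8-byte eBPF instruction (an opcode, a register byte, a 16-bit offset and a 32-bit immediate). The following is a minimal sketch of a decoder, assuming the little-endian layout that the dumps appear to use. The type, function and array names are made up for illustration and are not part of the kernel or libbpf.

```c
#include <stdint.h>

/* Hypothetical container for one decoded eBPF instruction. */
struct decoded_insn {
	uint8_t opcode;  /* instruction class plus operation/size/mode bits */
	uint8_t dst_reg; /* low nibble of byte 1 (little-endian layout) */
	uint8_t src_reg; /* high nibble of byte 1 */
	int16_t off;     /* signed offset, bytes 2..3 */
	int32_t imm;     /* signed immediate, bytes 4..7 */
};

/* Decode one 8-byte row of a "[ VM ]" dump, assuming little-endian. */
static struct decoded_insn decode_insn(const uint8_t b[8])
{
	struct decoded_insn d;

	d.opcode  = b[0];
	d.dst_reg = b[1] & 0x0f;
	d.src_reg = b[1] >> 4;
	d.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
	d.imm = (int32_t)((uint32_t)b[4] | ((uint32_t)b[5] << 8) |
			  ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24));
	return d;
}

/* Two rows copied from the dumps above. */
static const uint8_t add_r1_minus_8[8] = {
	0x07, 0x01, 0x00, 0x00, 0xf8, 0xff, 0xff, 0xff
};
static const uint8_t ldx_r3_fp_minus_16[8] = {
	0x79, 0xa3, 0xf0, 0xff, 0x00, 0x00, 0x00, 0x00
};
```

With these rows, decode_insn(add_r1_minus_8) gives opcode 0x07 (64-bit add of an immediate), dst_reg 1 and imm -8, and decode_insn(ldx_r3_fp_minus_16) gives opcode 0x79 (64-bit load), dst_reg 3, src_reg 10 (the BPF frame pointer) and off -16.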

shahab-vahedi commented 5 months ago

Here's a reduced example: ring-bug.tar.gz

This reduced example crashes kernel 5.x, but on 6.x we get:

# ./ring
libbpf: loading object 'ring_bpf' from buffer
libbpf: elf: section(3) tp/sched/sched_process_exec, size 264, link 0, flags 6, type=1
libbpf: sec 'tp/sched/sched_process_exec': found program 'handle_exec' at insn offset 0 (0 bytes), code size 33 insns (264 bytes)
libbpf: elf: section(4) .reltp/sched/sched_process_exec, size 16, link 23, flags 0, type=9
libbpf: elf: section(5) license, size 13, link 0, flags 3, type=1
libbpf: license of ring_bpf is Dual BSD/GPL
libbpf: elf: section(6) .rodata, size 8, link 0, flags 2, type=1
libbpf: elf: section(7) .maps, size 48, link 0, flags 3, type=1
libbpf: elf: section(14) .BTF, size 11868, link 0, flags 0, type=1
libbpf: elf: section(16) .BTF.ext, size 332, link 0, flags 0, type=1
libbpf: elf: section(23) .symtab, size 360, link 1, flags 0, type=2
libbpf: looking for externs among 15 symbols...
libbpf: collected 0 externs total
libbpf: map 'rb': at sec_idx 7, offset 0.
libbpf: map 'rb': found type = 27.
libbpf: map 'rb': found max_entries = 262144.
libbpf: map 'exec_start': at sec_idx 7, offset 16.
libbpf: map 'exec_start': found type = 1.
libbpf: map 'exec_start': found key [14], sz = 4.
libbpf: map 'exec_start': found value [17], sz = 8.
libbpf: map 'exec_start': found max_entries = 8192.
libbpf: map 'ring_bpf.rodata' (global data): at sec_idx 6, offset 0, flags 480.
libbpf: map 2 is "ring_bpf.rodata"
libbpf: sec '.reltp/sched/sched_process_exec': collecting relocation for section(3) 'tp/sched/sched_process_exec'
libbpf: sec '.reltp/sched/sched_process_exec': relo #0: insn #0 against 'rb'
libbpf: prog 'handle_exec': found map 0 (rb, sec 7, off 0) for insn #0
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
libbpf: map 'rb': created successfully, fd=4
libbpf: map 'exec_start': created successfully, fd=5
libbpf: map 'ring_bpf.rodata': created successfully, fd=6
libbpf: sec 'tp/sched/sched_process_exec': found 2 CO-RE relocations
libbpf: CO-RE relocating [34] struct task_struct: found target candidate [1] struct task_struct in [vmlinux]
libbpf: prog 'handle_exec': relo #0: <byte_off> [34] struct task_struct.real_parent (0:66 @ offset 856)
libbpf: prog 'handle_exec': relo #0: matching candidate #0 <byte_off> [1] struct task_struct.real_parent (0:66 @ offset 584)
libbpf: prog 'handle_exec': relo #0: patched insn #12 (ALU/ALU64) imm 856 -> 584
libbpf: prog 'handle_exec': relo #1: <byte_off> [34] struct task_struct.tgid (0:65 @ offset 852)
libbpf: prog 'handle_exec': relo #1: matching candidate #0 <byte_off> [1] struct task_struct.tgid (0:65 @ offset 580)
libbpf: prog 'handle_exec': relo #1: patched insn #19 (ALU/ALU64) imm 852 -> 580
libbpf: failed to re-mmap() map 'ring_bpf.rodata': -22
Failed to load and verify BPF skeleton

shahab-vahedi commented 5 months ago

This occurs around:

/* Remap anonymous mmap()-ed "map initialization image" as
 * a BPF map-backed mmap()-ed memory, but preserving the same
 * memory address. This will cause kernel to change process'
 * page table to point to a different piece of kernel memory,
 * but from userspace point of view memory address (and its
 * contents, being identical at this point) will stay the
 * same. This mapping will be released by bpf_object__close()
 * as per normal clean up procedure, so we don't need to worry
 * about it from skeleton's clean up perspective.
 */
*mmaped = mmap(map->mmaped, mmap_sz, prot, MAP_SHARED | MAP_FIXED, map_fd, 0);
if (*mmaped == MAP_FAILED) {
    err = -errno;
    *mmaped = NULL;
    pr_warn("failed to re-mmap() map '%s': %d\n",
         bpf_map__name(map), err);
    return libbpf_err(err);
}

The snippet above is from line 13320 of libbpf.c in kernel v6.7.

shahab-vahedi commented 5 months ago

I used the static libbpf.a from libbpf-1.3.0 in my final binary. If I use the shared version, it ends up using /usr/lib/libbpf.so.1.1.0 from the Linux image, and then the BPF program does load into memory and hits the bug. So the upshot is:

libbpf 1.1.0 --> load and crash
libbpf 1.3.0 --> refuses to load due to some re-mmap() issue

kolerov commented 5 months ago

@shahab-vahedi Don't use -latomic while building binaries - this somehow poisons a binary (maybe it impacts alignment of sections).

shahab-vahedi commented 5 months ago

@kolerov First and foremost, thanks for your tip about -latomic. As strange as it is, it unblocked me and allowed further debugging.

The problem is related to using ARC's fp register while it has not been initialised properly by a mov fp, sp. It is not initialised because the FP usage logic in arc_analyze_reg_usage() was not sound enough to detect every case where the frame is needed.

I have replaced the depth calculation code with the value provided by the BPF verifier (aux->stack_depth):

 static void analyze_reg_usage(struct jit_context *ctx)
 {
        size_t i;
        u32 usage = 0;
-       s16 size = 0;   /* Will be "min()"ed against negative numbers. */
        const struct bpf_insn *insn = ctx->prog->insnsi;

        for (i = 0; i < ctx->prog->len; i++) {
@@ -257,21 +252,10 @@ static void analyze_reg_usage(struct jit_context *ctx)
                bpf_reg = insn[i].dst_reg;
                call = (insn[i].code == (BPF_JMP | BPF_CALL)) ? true : false;
                usage |= mask_for_used_regs(bpf_reg, call);
-
-               /* Is FP usage in the form of "*(FP + -off) = data"? */
-               if (bpf_reg == BPF_REG_FP) {
-                       const u8 store_mem_mask = 0x67;
-                       const u8 code_mask = insn[i].code & store_mem_mask;
-                       if (code_mask == (BPF_ST  | BPF_MEM) ||
-                           code_mask == (BPF_STX | BPF_MEM)) {
-                               /* Then, record the deepest "off"set. */
-                               size = min(size, insn[i].off);
-                       }
-               }
        }

        ctx->arc_regs_clobbered = usage;
-       ctx->frame_size = abs(size);
+       ctx->frame_size = ctx->prog->aux->stack_depth;
 }

@kolerov please try the latest branch and confirm if this fixes the problem for you.
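
To illustrate why the removed heuristic could miss stack usage, here is a minimal userspace sketch. It re-creates the old scan over a simplified stand-in for struct bpf_insn and applies it to the opening instructions of the program in the earlier dump that begins with 0xbf, 0xa1, which passes fp-8 to a helper and never stores through FP directly. The names old_frame_size, struct insn and the sample arrays are illustrative, not kernel code, and the second sample is a made-up contrast case.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct bpf_insn. */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

enum {
	REG_FP   = 10,   /* BPF frame pointer, r10 */
	CLS_ST   = 0x02, /* BPF_ST */
	CLS_STX  = 0x03, /* BPF_STX */
	MODE_MEM = 0x60, /* BPF_MEM */
};

/* Re-creation of the removed logic: deepest "*(FP + -off) = data" store. */
static int old_frame_size(const struct insn *insn, size_t len)
{
	int size = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t masked = insn[i].code & 0x67;

		if (insn[i].dst_reg != REG_FP)
			continue;
		if (masked != (CLS_ST | MODE_MEM) &&
		    masked != (CLS_STX | MODE_MEM))
			continue;
		if (insn[i].off < size)
			size = insn[i].off;
	}
	return -size;
}

/* r1 = fp; r1 += -8; r2 = 8; r3 = 0; call helper (receives fp-8 in r1). */
static const struct insn helper_gets_stack_ptr[] = {
	{ .code = 0xbf, .dst_reg = 1, .src_reg = REG_FP },
	{ .code = 0x07, .dst_reg = 1, .imm = -8 },
	{ .code = 0xb7, .dst_reg = 2, .imm = 8 },
	{ .code = 0xb7, .dst_reg = 3, .imm = 0 },
	{ .code = 0x85 },
};

/* Hypothetical contrast case: *(u64 *)(fp - 8) = 0. */
static const struct insn direct_store[] = {
	{ .code = 0x7a, .dst_reg = REG_FP, .off = -8, .imm = 0 },
};
```

Here old_frame_size(helper_gets_stack_ptr, 5) is 0 even though the helper is handed a stack address, while old_frame_size(direct_store, 1) is 8. The verifier's aux->stack_depth is meant to cover accesses made through helper arguments as well, which is presumably why switching to it restores the frame setup.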

kolerov commented 5 months ago

@shahab-vahedi Now the bootstrap example starts and does not crash. However, events are passed through the ring buffer with missing information:

# ./bootstrap
TIME     EVENT COMM             PID     PPID    FILENAME/EXIT CODE
01:03:23 EXEC                   201     170
01:07:12 EXEC                   202     108
01:07:14 EXEC                   204     202
01:07:14 EXEC                   204     202

The problem is in the definition of event structure in bootstrap.h:

struct event {
        int pid;
        int ppid;
        unsigned exit_code;
        unsigned long long duration_ns;
        char comm[TASK_COMM_LEN];
        char filename[MAX_FILENAME_LEN];
        bool exit_event;
};

All fields after the long long field are read incorrectly. Moving the duration_ns field to the end of the structure allows the other fields to be read correctly:

# ./bootstrap
TIME     EVENT COMM             PID     PPID    FILENAME/EXIT CODE
01:13:28 EXEC  ls               236     170     /bin/ls
01:13:29 EXIT  ls               236     170     [0] (144617218213ms)

However, duration_ns itself is still read incorrectly. For comparison, here is the output on the host machine:

17:53:19 EXEC  sh               2297257 574857  /bin/sh
17:53:19 EXEC  grep             2297259 2297257 /usr/bin/grep
17:53:19 EXEC  ls               2297258 2297257 /usr/bin/ls
17:53:19 EXIT  grep             2297259 2297257 [0] (7ms)
17:53:19 EXIT  ls               2297258 2297257 [2] (5ms)
17:53:19 EXIT  sh               2297257 574857  [0] (9ms)

I suppose that this issue is somehow related to reading/writing 64-bit types in eBPF.
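
One way to test that suspicion is to compare where each side places duration_ns. The sketch below reports field offsets for the original and the reordered layout; building it once with the user-space toolchain for the target and once for the BPF target would show whether the two disagree. The values of TASK_COMM_LEN and MAX_FILENAME_LEN are assumed to match the libbpf-bootstrap header, and struct event_reordered and the two helper functions are illustrative names.

```c
#include <stdbool.h>
#include <stddef.h>

#define TASK_COMM_LEN    16  /* assumed, as in libbpf-bootstrap */
#define MAX_FILENAME_LEN 127 /* assumed, as in libbpf-bootstrap */

/* Original layout from bootstrap.h. */
struct event {
	int pid;
	int ppid;
	unsigned exit_code;
	unsigned long long duration_ns;
	char comm[TASK_COMM_LEN];
	char filename[MAX_FILENAME_LEN];
	bool exit_event;
};

/* Layout with duration_ns moved to the end, as tried above. */
struct event_reordered {
	int pid;
	int ppid;
	unsigned exit_code;
	char comm[TASK_COMM_LEN];
	char filename[MAX_FILENAME_LEN];
	bool exit_event;
	unsigned long long duration_ns;
};

/* Padding the compiler inserts in front of duration_ns (original layout). */
static size_t padding_before_duration(void)
{
	return offsetof(struct event, duration_ns) -
	       (offsetof(struct event, exit_code) + sizeof(unsigned));
}

/* Offset of duration_ns once it is moved to the end. */
static size_t reordered_duration_offset(void)
{
	return offsetof(struct event_reordered, duration_ns);
}
```

Assuming a 4-byte int and a 1-byte bool, 8-byte alignment for unsigned long long gives a padding of 4 and a reordered offset of 160, while 4-byte alignment gives 0 and 156. If the BPF object and the user-space binary differ here, every field after duration_ns shifts in the original layout and only duration_ns is misplaced in the reordered one, which would be consistent with the output above. This is a hypothesis, not something confirmed in this thread.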

shahab-vahedi commented 5 months ago

> @shahab-vahedi Don't use -latomic while building binaries - this somehow poisons a binary (maybe it impacts alignment of sections).

Passing the -latomic flag on the build command line adds a NEEDED entry for libatomic to the final binary:

$ arc-linux-readelf -d ring.static | head
  Dynamic section at offset 0xc1ef0 contains 28 entries:
    Tag        Type                         Name/Value
   0x00000001 (NEEDED)                     Shared library: [libatomic.so.1]
   0x00000001 (NEEDED)                     Shared library: [libc.so.6]

If one passes /absolute/path/to/toolchain/sysroot/usr/lib/libatomic.a on the build command line instead of -latomic, the binary is accepted by the verifier:

$ arc-linux-readelf -d ring.static | head
  Dynamic section at offset 0xc1ef0 contains 28 entries:
    Tag        Type                         Name/Value
   0x00000001 (NEEDED)                     Shared library: [libc.so.6]

shahab-vahedi commented 5 months ago

The crashing issue is resolved. Closing this thread and opening a new one regarding the missing data.