OpenCloudOS / nettrace

nettrace is an eBPF-based tool to trace network packets and diagnose network problems.

What's wrong with “ERROR: failed to load kprobe-based eBPF” #34

Closed xuinsd45sd closed 1 year ago

xuinsd45sd commented 1 year ago

Hi, I downloaded nettrace-1.2.3-1.tl3.aarch64.tar.bz2 and set it up on my machine (kernel 5.8). I have enabled the config option CONFIG_DEBUG_INFO_BTF=y.

I ran into the following problem. Can anyone tell me what I should do to run nettrace correctly?

nettrace --drop

ERROR: failed to load kprobe-based eBPF
ERROR: failed to load kprobe-based bpf
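One thing worth ruling out here is whether the running kernel actually exposes BTF, since CONFIG_DEBUG_INFO_BTF=y only has an effect on a kernel that was built and booted with it. The following is a minimal sketch in Python, assuming the standard /sys/kernel/btf/vmlinux location; the helper names are hypothetical and are not part of nettrace.

```python
import os
import re

# Kernels built with CONFIG_DEBUG_INFO_BTF=y expose their type info here.
BTF_PATH = "/sys/kernel/btf/vmlinux"


def config_enables_btf(config_text: str) -> bool:
    """Return True if the kernel config text has CONFIG_DEBUG_INFO_BTF=y.

    Hypothetical helper: pass it the contents of the config file of the
    kernel that is actually booted (for example /boot/config-<release>).
    """
    pattern = r"^CONFIG_DEBUG_INFO_BTF=y\s*$"
    return re.search(pattern, config_text, re.MULTILINE) is not None


def running_kernel_has_btf(path: str = BTF_PATH) -> bool:
    """Return True if the running kernel exposes vmlinux BTF in sysfs."""
    return os.path.exists(path)


if __name__ == "__main__":
    print("vmlinux BTF present:", running_kernel_has_btf())
```

If the sysfs file is missing, the booted kernel was not built with BTF, whatever the config in the source tree says.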

menglongdong commented 1 year ago

Hi, you can add '--debug' to get more debug info, and then we can see what is happening.

xuinsd45sd commented 1 year ago

Here is the detailed debug info. Thanks!

nettrace --drop --debug

DEBUG: command: cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format | grep NOT_SPECIFIED
WARN: skb drop reason is not support by your kernel, drop reason will not be printed
DEBUG: nft high version: 0
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf: Type info disagrees with actual arguments due to compiler optimizations
; DEFINE_KPROBE(ipt_do_table, 1)
0: (bf) r6 = r1
; DEFINE_KPROBE(ipt_do_table, 1)
1: (79) r7 = *(u64 *)(r6 +0)
; struct nf_hook_state *state = (void *)PT_REGS_PARM2(ctx);
2: (79) r3 = *(u64 *)(r6 +8)
; struct xt_table *table = (void *)PT_REGS_PARM3(ctx);
3: (79) r8 = *(u64 *)(r6 +16)
4: (b7) r1 = 0
; nf_event_t e = {
5: (6b) *(u16 *)(r10 -8) = r1
6: (7b) *(u64 *)(r10 -16) = r1
7: (7b) *(u64 *)(r10 -24) = r1
8: (7b) *(u64 *)(r10 -32) = r1
9: (7b) *(u64 *)(r10 -40) = r1
10: (7b) *(u64 *)(r10 -48) = r1
11: (7b) *(u64 *)(r10 -56) = r1
12: (7b) *(u64 *)(r10 -64) = r1
13: (7b) *(u64 *)(r10 -72) = r1
14: (7b) *(u64 *)(r10 -80) = r1
15: (7b) *(u64 *)(r10 -88) = r1
16: (7b) *(u64 *)(r10 -96) = r1
17: (7b) *(u64 *)(r10 -104) = r1
18: (7b) *(u64 *)(r10 -112) = r1
19: (7b) *(u64 *)(r10 -120) = r1
20: (7b) *(u64 *)(r10 -128) = r1
21: (7b) *(u64 *)(r10 -136) = r1
22: (85) call unknown#195896080
invalid func unknown#195896080
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
libbpf: -- END LOG --
libbpf: failed to load program '__trace_ipt_do_table'
libbpf: failed to load object 'kprobe_core'
libbpf: failed to load BPF skeleton 'kprobe_core': -4007
WARN: failed to load skel: kprobe_core
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf: Type info disagrees with actual arguments due to compiler optimizations
; DEFINE_KPROBE(ipt_do_table, 1)
0: (bf) r6 = r1
; DEFINE_KPROBE(ipt_do_table, 1)
1: (79) r7 = *(u64 *)(r6 +0)
; struct nf_hook_state *state = (void *)PT_REGS_PARM2(ctx);
2: (79) r3 = *(u64 *)(r6 +8)
; struct xt_table *table = (void *)PT_REGS_PARM3(ctx);
3: (79) r8 = *(u64 *)(r6 +16)
4: (b7) r1 = 0
; nf_event_t e = {
5: (6b) *(u16 *)(r10 -8) = r1
6: (7b) *(u64 *)(r10 -16) = r1
7: (7b) *(u64 *)(r10 -24) = r1
8: (7b) *(u64 *)(r10 -32) = r1
9: (7b) *(u64 *)(r10 -40) = r1
10: (7b) *(u64 *)(r10 -48) = r1
11: (7b) *(u64 *)(r10 -56) = r1
12: (7b) *(u64 *)(r10 -64) = r1
13: (7b) *(u64 *)(r10 -72) = r1
14: (7b) *(u64 *)(r10 -80) = r1
15: (7b) *(u64 *)(r10 -88) = r1
16: (7b) *(u64 *)(r10 -96) = r1
17: (7b) *(u64 *)(r10 -104) = r1
18: (7b) *(u64 *)(r10 -112) = r1
19: (7b) *(u64 *)(r10 -120) = r1
20: (7b) *(u64 *)(r10 -128) = r1
21: (7b) *(u64 *)(r10 -136) = r1
22: (85) call unknown#195896080
invalid func unknown#195896080
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
libbpf: -- END LOG --
libbpf: failed to load program '__trace_ipt_do_table'
libbpf: failed to load object 'kprobe'
libbpf: failed to load BPF skeleton 'kprobe': -4007
WARN: failed to load skel: kprobe
ERROR: failed to load kprobe-based eBPF
ERROR: failed to load kprobe-based bpf

menglongdong commented 1 year ago

This looks weird. However, I think the latest code may have fixed this problem, as we now load and attach only the programs we need.

I'll release a new version based on the latest code, and we can see whether the problem still exists then.

menglongdong commented 1 year ago

Hello, you can try the latest release, and let's see if it works.

zhonglin6666 commented 1 year ago

root@node1:~# cat /etc/issue
Ubuntu 20.04.6 LTS \n \l

root@node1:~# uname -a
Linux node1 5.4.0-164-generic #181-Ubuntu SMP Fri Sep 1 13:41:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

nettrace --version

version: 1.2.6.tl3

the same issue!

; if (handle_entry(ctx))
83: (67) r0 <<= 32
84: (77) r0 >>= 32
; if (handle_entry(ctx))
85: (55) if r0 != 0x0 goto pc+103
86: (b7) r1 = 0
; if (bpf_core_type_exists(struct nft_pktinfo)) {
87: (7b) *(u64 *)(r10 -264) = r8
88: (15) if r1 == 0x0 goto pc+8
; if (!bpf_core_field_exists(pkt->xt))
97: <invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [1436] struct xt_action_param.state (0:2 @ offset 16)
processed 414 insns (limit 1000000) max_states_per_insn 0 total_states 33 peak_states 33 mark_read 30
-- END PROG LOAD LOG --
libbpf: prog '__trace_nft_do_chain': failed to load: -22
libbpf: failed to load object 'kprobe'
libbpf: failed to load BPF skeleton 'kprobe': -22
ERROR: failed to load kprobe-based eBPF
ERROR: failed to load kprobe-based bpf

menglongdong commented 1 year ago

Sadly, this looks like a different problem. It seems Ubuntu 20.04 doesn't handle CO-RE properly and doesn't skip the dead code, which it should.

Anyway, I'll test more to find a solution.

Thanks for reporting this.
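The two logs in this thread do fail with different verifier messages: "invalid func unknown#..." on the 5.8 aarch64 machine and "failed to resolve CO-RE relocation" on Ubuntu 20.04. A small sketch for telling them apart in saved `--debug` output might look like the following. The function and label names are hypothetical, and the patterns are copied from the logs above rather than from any libbpf specification.

```python
import re

# Marker strings taken from the two logs quoted in this thread.
REASONS = {
    # The verifier rejected a call instruction with an unknown function id.
    "invalid_func": re.compile(r"invalid func unknown#\d+"),
    # libbpf could not resolve a CO-RE relocation against the kernel's types.
    "core_relocation": re.compile(r"failed to resolve CO-RE relocation"),
}

# Both libbpf message styles that appear in the thread.
FAILED_PROG = re.compile(
    r"libbpf: (?:failed to load program '([^']+)'"
    r"|prog '([^']+)': failed to load)"
)


def classify(log: str) -> dict:
    """Summarize which known failure markers and programs appear in a log."""
    reasons = sorted(name for name, pat in REASONS.items() if pat.search(log))
    programs = sorted({old or new for old, new in FAILED_PROG.findall(log)})
    return {"reasons": reasons, "programs": programs}


if __name__ == "__main__":
    import sys

    # Usage: nettrace --drop --debug 2>&1 | python3 classify.py
    print(classify(sys.stdin.read()))
```

This only labels the symptom. It does not explain the cause, which still has to be worked out from the full log.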

menglongdong commented 1 year ago

It's my fault. CO-RE is enabled globally in vmlinux.h by default, which is unnecessary for the probe reads we use here.

Disabling it solves this problem, and I'll send an MR later.