OpenCloudOS / nettrace

nettrace is an eBPF-based tool for tracing network packets and diagnosing network problems.

eBPF fails to load, reporting "failed to load kprobe-based eBPF" #58

Closed hezhiye closed 1 year ago

hezhiye commented 1 year ago

Hi, I am using the release x86 build of nettrace. Running ./nettrace -p icmp --detail --diag --date --debug produces:

    DEBUG: prog: trace_ipv4_confirm is made no-autoload
    DEBUG: prog: trace_nf_confirm is made no-autoload
    DEBUG: prog: trace_nf_confirm is made no-autoload
    DEBUG: prog: trace_ipv4_conntrack_in is made no-autoload
    DEBUG: prog: trace_ipv4_conntrack_in is made no-autoload
    DEBUG: prog: __trace_nf_conntrack_in is made no-autoload
    DEBUG: prog: trace_nf_conntrack_in is made no-autoload
    DEBUG: prog: trace_ipv4_pkt_to_tuple is made no-autoload
    DEBUG: prog: trace_ipv4_pkt_to_tuple is made no-autoload
    DEBUG: prog: trace_tcp_new is made no-autoload
    DEBUG: prog: trace_tcp_new is made no-autoload
    DEBUG: prog: trace_tcp_pkt_to_tuple is made no-autoload
    DEBUG: prog: trace_tcp_pkt_to_tuple is made no-autoload
    DEBUG: prog: trace_resolve_normal_ct is made no-autoload
    DEBUG: prog: __trace_resolve_normal_ct is made no-autoload
    DEBUG: prog: trace_tcp_packet is made no-autoload
    DEBUG: prog: trace_tcp_packet is made no-autoload
    DEBUG: prog: trace_tcp_in_window is made no-autoload
    DEBUG: prog: trace_tcp_in_window is made no-autoload
    DEBUG: prog: _tracenf_ct_refresh_acct is made no-autoload
    DEBUG: prog: trace___nf_ct_refresh_acct is made no-autoload
    DEBUG: prog: trace_ip_finish_output_gso is made no-autoload
    DEBUG: prog: trace_ip_finish_output_gso is made no-autoload
    DEBUG: prog: trace_xfrm4_output is made no-autoload
    DEBUG: prog: trace_xfrm4_output is made no-autoload
    DEBUG: prog: trace_xfrm_output is made no-autoload
    DEBUG: prog: trace_xfrm_output is made no-autoload
    DEBUG: prog: trace_xfrm_output2 is made no-autoload
    DEBUG: prog: trace_xfrm_output2 is made no-autoload
    DEBUG: prog: trace_xfrm_output_gso is made no-autoload
    DEBUG: prog: trace_xfrm_output_gso is made no-autoload
    DEBUG: prog: __trace_xfrm_output_resume is made no-autoload
    DEBUG: prog: trace_xfrm_output_resume is made no-autoload
    DEBUG: prog: trace_xfrm4_transport_output is made no-autoload
    DEBUG: prog: __trace_xfrm4_transport_output is made no-autoload
    DEBUG: prog: trace_xfrm4_prepare_output is made no-autoload
    DEBUG: prog: trace_xfrm4_prepare_output is made no-autoload
    DEBUG: prog: __trace_xfrm4_policy_check is made no-autoload
    DEBUG: prog: trace_xfrm4_policy_check is made no-autoload
    DEBUG: prog: trace_xfrm4_rcv is made no-autoload
    DEBUG: prog: trace_xfrm4_rcv is made no-autoload
    DEBUG: prog: trace_xfrm_input is made no-autoload
    DEBUG: prog: trace_xfrm_input is made no-autoload
    DEBUG: prog: trace_xfrm4_transport_input is made no-autoload
    DEBUG: prog: __trace_xfrm4_transport_input is made no-autoload
    DEBUG: prog: trace_ah_output is made no-autoload
    DEBUG: prog: trace_ah_output is made no-autoload
    DEBUG: prog: trace_esp_output is made no-autoload
    DEBUG: prog: trace_esp_output is made no-autoload
    DEBUG: prog: trace_esp_output_tail is made no-autoload
    DEBUG: prog: trace_esp_output_tail is made no-autoload
    DEBUG: prog: trace_ah_input is made no-autoload
    DEBUG: prog: trace_ah_input is made no-autoload
    DEBUG: prog: trace_esp_input is made no-autoload
    DEBUG: prog: trace_esp_input is made no-autoload
    DEBUG: prog: trace_xfrm4_udp_encap_rcv is made no-autoload
    DEBUG: prog: trace_xfrm4_udp_encap_rcv is made no-autoload
    DEBUG: prog: __trace_xfrm4_rcv_encap is made no-autoload
    DEBUG: prog: trace_xfrm4_rcv_encap is made no-autoload
    DEBUG: prog: trace___udp_queue_rcv_skb is made no-autoload
    DEBUG: prog: trace_udp_queue_rcv_skb is made no-autoload
    DEBUG: prog: trace_ping_queue_rcv_skb is made no-autoload
    DEBUG: prog: trace_ping_queue_rcv_skb is made no-autoload
    libbpf: prog 'trace_ipt_do_table_new': BPF program load failed: Invalid argument
    libbpf: prog 'trace_ipt_do_table_new': -- BEGIN PROG LOAD LOG --
    R1 type=ctx expected=fp
    0: R1=ctx(off=0,imm=0) R10=fp0
    ; DEFINE_KPROBE_TARGET(ipt_do_table_new, ipt_do_table, 2)
    0: (bf) r6 = r1 ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
    ; DEFINE_KPROBE_TARGET(ipt_do_table_new, ipt_do_table, 2)
    1: (79) r7 = (u64 )(r6 +104) ; R6_w=ctx(off=0,imm=0) R7_w=Pscalar()
    ; struct xt_table table = (void )PT_REGS_PARM1(ctx);
    2: (79) r8 = (u64 )(r6 +112) ; R6_w=ctx(off=0,imm=0) R8_w=Pscalar()
    ; struct nf_hook_state state = (void )PT_REGS_PARM3(ctx);
    3: (79) r3 = (u64 )(r6 +96) ; R3_w=Pscalar() R6_w=ctx(off=0,imm=0)
    4: (b7) r1 = 0 ; R1_w=P0
    ; bpf_ipt_do_table();
    5: (6b) (u16 )(r10 -8) = r1 ; R1_w=P0 R10=fp0 fp-8=??????00
    6: (7b) (u64 )(r10 -16) = r1 ; R1_w=P0 R10=fp0 fp-16_w=00000000
    7: (7b) (u64 )(r10 -24) = r1 ; R1_w=P0 R10=fp0 fp-24_w=00000000
    8: (7b) (u64 )(r10 -32) = r1 ; R1_w=P0 R10=fp0 fp-32_w=00000000
    9: (7b) (u64 )(r10 -40) = r1 ; R1_w=P0 R10=fp0 fp-40_w=00000000
    10: (7b) (u64 )(r10 -48) = r1 ; R1_w=P0 R10=fp0 fp-48_w=00000000
    11: (7b) (u64 )(r10 -56) = r1 ; R1_w=P0 R10=fp0 fp-56_w=00000000
    12: (7b) (u64 )(r10 -64) = r1 ; R1_w=P0 R10=fp0 fp-64_w=00000000
    13: (7b) (u64 )(r10 -72) = r1 ; R1_w=P0 R10=fp0 fp-72_w=00000000
    14: (7b) (u64 )(r10 -80) = r1 ; R1_w=P0 R10=fp0 fp-80_w=00000000
    15: (7b) (u64 )(r10 -88) = r1 ; R1_w=P0 R10=fp0 fp-88_w=00000000
    16: (7b) (u64 )(r10 -96) = r1 ; R1_w=P0 R10=fp0 fp-96_w=00000000
    17: (7b) (u64 )(r10 -104) = r1 ; R1_w=P0 R10=fp0 fp-104_w=00000000
    18: (7b) (u64 )(r10 -112) = r1 ; R1_w=P0 R10=fp0 fp-112_w=00000000
    19: (7b) (u64 )(r10 -120) = r1 ; R1_w=P0 R10=fp0 fp-120_w=00000000
    20: (7b) (u64 )(r10 -128) = r1 ; R1_w=P0 R10=fp0 fp-128_w=00000000
    21: (7b) (u64 )(r10 -136) = r1 ; R1_w=P0 R10=fp0 fp-136_w=00000000
    22: (b7) r1 = 0 ; R1_w=P0
    23: (0f) r3 += r1 ; R1_w=P0 R3_w=Pscalar()
    24: (b7) r1 = 33 ; R1_w=P33
    25: (63) (u32 )(r10 -68) = r1 ; R1_w=P33 R10=fp0 fp-72_w=mmmm0000
    26: (bf) r1 = r10 ; R1_w=fp0 R10=fp0
    ;
    27: (07) r1 += -176 ; R1_w=fp-176
    ; bpf_ipt_do_table();
    28: (b7) r2 = 4 ; R2_w=P4
    29: (85) call bpf_probe_read_kernel#113 ; R0_w=Pscalar() fp-176=????mmmm
    ; bpf_ipt_do_table();
    30: (61) r1 = (u32 )(r10 -176) ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
    ; bpf_ipt_do_table();
    31: (73) (u8 )(r10 -8) = r1 ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8_w=P
    32: failed to resolve CO-RE relocation [539] struct xt_table.name (0:7 @ offset 56)
    processed 33 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
    -- END PROG LOAD LOG --
    libbpf: prog 'trace_ipt_do_table_new': failed to load: -22
    libbpf: failed to load object 'kprobe'
    libbpf: failed to load BPF skeleton 'kprobe': -22
    ERROR: failed to load kprobe-based eBPF
    ERROR: failed to load kprobe-based bpf

I have confirmed that the kernel configuration is fine:

    CONFIG_KPROBES=y
    CONFIG_KPROBES_ON_FTRACE=y
    CONFIG_HAVE_KPROBES=y
    CONFIG_HAVE_KPROBES_ON_FTRACE=y
    CONFIG_KPROBE_EVENTS=y
    CONFIG_FTRACE=y
    CONFIG_DYNAMIC_FTRACE=y
    CONFIG_BPF=y
    CONFIG_HAVE_EBPF_JIT=y
    CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
    CONFIG_BPF_SYSCALL=y
    CONFIG_BPF_JIT=y
    CONFIG_DEBUG_INFO_BTF=y
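The decisive line in a load log like the one above is the "failed to resolve CO-RE relocation" entry, which names the struct and field libbpf could not find in the target kernel's BTF. A minimal sketch for pulling that entry out of a long log follows; the function name `find_core_relo_failures` and the regular expression are illustrative assumptions, not part of nettrace or libbpf.

```python
import re

# Matches the verifier-log line libbpf prints for an unresolved CO-RE relocation,
# e.g. "failed to resolve CO-RE relocation [539] struct xt_table.name (0:7 @ offset 56)".
# \s+ tolerates the line wrapping that pasted logs often introduce.
CORE_RELO_RE = re.compile(
    r"failed to\s+resolve CO-RE relocation \[(?P<type_id>\d+)\] "
    r"(?P<kind>struct|union) (?P<type>\w+)\.(?P<field>[\w.]+) "
    r"\((?P<spec>[\d:]+) @ offset (?P<offset>\d+)\)"
)


def find_core_relo_failures(log_text):
    """Return one dict per unresolved CO-RE relocation found in log_text."""
    failures = []
    for m in CORE_RELO_RE.finditer(log_text):
        failures.append({
            "type_id": int(m.group("type_id")),   # BTF type id in the BPF object
            "type": m.group("type"),              # e.g. xt_table
            "field": m.group("field"),            # e.g. name
            "spec": m.group("spec"),              # access spec, e.g. 0:7
            "offset": int(m.group("offset")),     # offset in the BPF object's view
        })
    return failures


sample = (
    "32: failed to resolve CO-RE relocation [539] struct xt_table.name "
    "(0:7 @ offset 56) processed 33 insns (limit 1000000)"
)
print(find_core_relo_failures(sample))
```

Applied to the log in this comment, it would report `xt_table` and `name`, which points at the xtables structure rather than at the kprobe configuration options.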

hezhiye commented 1 year ago

I am using the prebuilt [v1.2.5] x86_64 nettrace binary, and I remember it ran successfully once at the very beginning.

hezhiye commented 1 year ago

The latest master branch, built in a container, gives the same result.

menglongdong commented 1 year ago

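Does the latest code fail with the same error? It looks like CO-RE cannot find the xtables-related struct.

This error can occur if your kernel does not enable (or does not support) DEBUG_INFO_BTF_MODULES, CONFIG_NETFILTER_XTABLES is not set to y, and you are currently using iptables-legacy.

I will add compatibility handling for this case.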
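Two of the three conditions named above can be read from the kernel config. A minimal sketch, assuming the config is available as plain text (for example, decompressed from /proc/config.gz); the helper names `parse_kconfig`, `xtables_btf_missing`, and `load_running_config` are hypothetical, and the iptables-legacy condition is not covered because it is not a kernel config option.

```python
import gzip


def parse_kconfig(text):
    """Parse kernel config text into {option: value}; unset options are omitted."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # "# CONFIG_FOO is not set" counts as unset
        key, sep, value = line.partition("=")
        if sep:
            options[key] = value
    return options


def xtables_btf_missing(options):
    """True when struct xt_table is likely absent from the BTF that CO-RE can see.

    Assumption taken from the comment above: the struct is visible if x_tables
    is built in (=y) or if module BTF is generated.
    """
    module_btf = options.get("CONFIG_DEBUG_INFO_BTF_MODULES") == "y"
    xtables_builtin = options.get("CONFIG_NETFILTER_XTABLES") == "y"
    return not module_btf and not xtables_builtin


def load_running_config(path="/proc/config.gz"):
    """Read the running kernel's config; the file only exists on some kernels."""
    with gzip.open(path, "rt") as f:
        return parse_kconfig(f.read())


reported = parse_kconfig(
    "CONFIG_DEBUG_INFO_BTF=y\n"
    "CONFIG_NETFILTER_XTABLES=m\n"
    "# CONFIG_DEBUG_INFO_BTF_MODULES is not set\n"
)
print(xtables_btf_missing(reported))
```

On a live system, `xtables_btf_missing(load_running_config())` would perform the same check against the running kernel.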

hezhiye commented 1 year ago

@xmmgithub I am thoroughly impressed. You are right: iptables is in use, DEBUG_INFO_BTF_MODULES is not enabled, and CONFIG_NETFILTER_XTABLES=m.

hezhiye commented 1 year ago

@xmmgithub The first run succeeded because I started nettrace first and only then ran iptables. In that order it starts up fine.

menglongdong commented 1 year ago

In that order the iptables module cannot be traced. I have added compatibility handling for this case; please test with the latest code.
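Since the outcome depends on whether the iptables modules were already loaded when nettrace started, it can help to record the module state before each run. A minimal sketch, assuming /proc/modules-style text as input; the helper names and the default module list (`x_tables`, `ip_tables`) are illustrative assumptions, not something nettrace itself checks in this form.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names


def iptables_modules_loaded(proc_modules_text, wanted=("x_tables", "ip_tables")):
    """Map each module of interest to whether it is currently loaded."""
    present = loaded_modules(proc_modules_text)
    return {name: name in present for name in wanted}


sample = (
    "ip_tables 32768 1 iptable_filter, Live 0x0000000000000000\n"
    "x_tables 53248 2 iptable_filter,ip_tables, Live 0x0000000000000000\n"
)
print(iptables_modules_loaded(sample))
```

On a live system the input would be the contents of /proc/modules, read just before starting nettrace.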

hezhiye commented 1 year ago

After rebuilding, I added the iptables rules first and then ran nettrace. The error still occurs:

    DEBUG: prog: trace___ping_queue_rcv_skb is made no-autoload
    DEBUG: prog: trace_ping_queue_rcv_skb is made no-autoload
    libbpf: prog 'trace_ipt_do_table': BPF program load failed: Invalid argument
    libbpf: prog 'trace_ipt_do_table': -- BEGIN PROG LOAD LOG --
    R1 type=ctx expected=fp
    0: R1=ctx(off=0,imm=0) R10=fp0
    ; DEFINE_KPROBE_SKB(ipt_do_table, 2)
    0: (bf) r7 = r1 ; R1=ctx(off=0,imm=0) R7_w=ctx(off=0,imm=0)
    ; DEFINE_KPROBE_SKB(ipt_do_table, 2)
    1: (7b) (u64 )(r10 -240) = r7 ; R7_w=ctx(off=0,imm=0) R10=fp0 fp-240_w=ctx
    2: (79) r1 = (u64 )(r7 +104) ; R1_w=Pscalar() R7_w=ctx(off=0,imm=0)
    3: (7b) (u64 )(r10 -232) = r1 ; R1_w=Pscalar() R10=fp0 fp-232_w=mmmmmmmm
    4: (b7) r6 = 0 ; R6_w=P0
    ; DEFINE_KPROBE_SKB(ipt_do_table, 2)
    5: (63) (u32 )(r10 -152) = r6 ; R6_w=P0 R10=fp0 fp-152=????0000
    6: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
    ;
    7: (07) r2 += -152 ; R2_w=fp-152
    ; DEFINE_KPROBE_SKB(ipt_do_table, 2)
    8: (18) r1 = 0xffff8881413e3000 ; R1_w=map_ptr(off=0,ks=4,vs=152,imm=0)
    10: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=152,imm=0)
    11: (15) if r0 == 0x0 goto pc+92 ; R0_w=map_value(off=0,ks=4,vs=152,imm=0)
    12: (b7) r1 = 33 ; R1_w=P33
    ; DEFINE_KPROBE_SKB(ipt_do_table, 2)
    13: (6b) (u16 )(r10 -192) = r1 ; R1_w=P33 R10=fp0 fp-192_w=P33
    14: (7b) (u64 )(r10 -208) = r6 ; R6_w=P0 R10=fp0 fp-208_w=00000000
    15: (7b) (u64 )(r10 -216) = r0 ; R0_w=map_value(off=0,ks=4,vs=152,imm=0) R10=fp0 fp-216_w=map_value
    ; struct xt_table table = nt_regs_ctx(ctx, 1);
    16: (79) r6 = (u64 )(r7 +112) ; R6_w=Pscalar() R7_w=ctx(off=0,imm=0)
    ; struct nf_hook_state state = nt_regs_ctx(ctx, 3);
    17: (79) r3 = (u64 )(r7 +96) ; R3_w=Pscalar() R7_w=ctx(off=0,imm=0)
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    18: (71) r1 = (u8 )(r0 +113) ; R0=map_value(off=0,ks=4,vs=152,imm=0) R1=Pscalar(umax=255,var_off=(0x0; 0xff))
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    19: (55) if r1 != 0x0 goto pc+32 ; R1=P0
    20: (b7) r1 = 0 ; R1_w=P0
    21: (7b) (u64 )(r10 -32) = r1 ; R1_w=P0 R10=fp0 fp-32_w=00000000
    22: (7b) (u64 )(r10 -40) = r1 ; R1_w=P0 R10=fp0 fp-40_w=00000000
    23: (7b) (u64 )(r10 -48) = r1 ; R1_w=P0 R10=fp0 fp-48_w=00000000
    24: (7b) (u64 )(r10 -56) = r1 ; R1_w=P0 R10=fp0 fp-56_w=00000000
    25: (7b) (u64 )(r10 -64) = r1 ; R1_w=P0 R10=fp0 fp-64_w=00000000
    26: (7b) (u64 )(r10 -72) = r1 ; R1_w=P0 R10=fp0 fp-72_w=00000000
    27: (7b) (u64 )(r10 -80) = r1 ; R1_w=P0 R10=fp0 fp-80_w=00000000
    28: (7b) (u64 )(r10 -88) = r1 ; R1_w=P0 R10=fp0 fp-88_w=00000000
    29: (7b) (u64 )(r10 -96) = r1 ; R1_w=P0 R10=fp0 fp-96_w=00000000
    30: (7b) (u64 )(r10 -104) = r1 ; R1_w=P0 R10=fp0 fp-104_w=00000000
    31: (7b) (u64 )(r10 -112) = r1 ; R1_w=P0 R10=fp0 fp-112_w=00000000
    32: (7b) (u64 )(r10 -120) = r1 ; R1_w=P0 R10=fp0 fp-120_w=00000000
    33: (7b) (u64 )(r10 -128) = r1 ; R1_w=P0 R10=fp0 fp-128_w=00000000
    34: (7b) (u64 )(r10 -136) = r1 ; R1_w=P0 R10=fp0 fp-136_w=00000000
    35: (7b) (u64 )(r10 -144) = r1 ; R1_w=P0 R10=fp0 fp-144_w=00000000
    36: (7b) (u64 )(r10 -152) = r1 ; R1_w=P0 R10=fp0 fp-152_w=00000000
    37: (b7) r1 = 0 ; R1_w=P0
    38: (0f) r3 += r1 ; R1_w=P0 R3_w=Pscalar()
    39: (bf) r1 = r10 ; R1_w=fp0 R10=fp0
    ;
    40: (07) r1 += -184 ; R1_w=fp-184
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    41: (b7) r2 = 4 ; R2_w=P4
    42: (85) call bpf_probe_read_kernel#113 ; R0=Pscalar() fp-184=????mmmm
    43: (bf) r1 = r10 ; R1_w=fp0 R10=fp0
    ;
    44: (07) r1 += -152 ; R1_w=fp-152
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    45: (7b) (u64 )(r10 -224) = r1 ; R1_w=fp-152 R10=fp0 fp-224_w=fp
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    46: (61) r1 = (u32 )(r10 -184) ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    47: (73) (u8 )(r10 -32) = r1 ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-32_w=P
    48: (b7) r1 = 128 ; R1_w=P128
    49: (bf) r7 = r10 ; R7_w=fp0 R10=fp0
    50: (07) r7 += -48 ; R7_w=fp-48
    51: (05) goto pc+34
    ; DECLARE_EVENT(nf_event_t, e, .hook = _C(state, hook))
    86: (7b) (u64 )(r10 -200) = r1 ; R1_w=P128 R10=fp0 fp-200_w=P128
    87: (b7) r1 = 0 ; R1_w=P0
    ; if (bpf_core_type_exists(struct xt_table))
    88: (15) if r1 == 0x0 goto pc+0 ; R1=P0
    89: failed to resolve CO-RE relocation [579] struct xt_table.name (0:7 @ offset 56)
    processed 55 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2
    -- END PROG LOAD LOG --
    libbpf: prog 'trace_ipt_do_table': failed to load: -22
    libbpf: failed to load object 'kprobe'
    libbpf: failed to load BPF skeleton 'kprobe': -22
    ERROR: failed to load kprobe-based eBPF
    ERROR: failed to load kprobe-based bpf

hezhiye commented 1 year ago

Command executed: ./nettrace -p icmp --detail --diag --date --debug

menglongdong commented 1 year ago

Did you pull the latest master code? Please make sure it contains this commit:

nettrace: compatible with xtables

Also, make sure you run a clean before building.

menglongdong commented 1 year ago

Your kernel code looks a bit odd... Does struct xt_table not have a name field?

hezhiye commented 1 year ago

Confirmed. The tree already contains the latest code, and I ran a clean before building in the container.

hezhiye commented 1 year ago

"Your kernel code looks a bit odd... Does struct xt_table not have a name field?"

linux-5.16/include/linux/netfilter/x_tables.h:

struct xt_table {
    struct list_head list;

    /* What hooks you will enter on */
    unsigned int valid_hooks;

    /* Man behind the curtain... */
    struct xt_table_info *private;

    /* hook ops that register the table with the netfilter core */
    struct nf_hook_ops *ops;

    /* Set this to THIS_MODULE if you are a module, otherwise NULL */
    struct module *me;

    u_int8_t af;            /* address/protocol family */
    int priority;           /* hook order */

    /* A unique name... */
    const char name[XT_TABLE_MAXNAMELEN];

};

Is this the one you mean?
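The relocation spec in the load log, `(0:7 @ offset 56)`, can be cross-checked against this definition. A minimal sketch using Python's ctypes, assuming an x86_64 layout (pointers modeled as 8-byte integers) and `XT_TABLE_MAXNAMELEN` equal to 32 as in the kernel's UAPI header; the class names are made up for illustration.

```python
import ctypes

XT_TABLE_MAXNAMELEN = 32  # assumed value, taken from the kernel UAPI x_tables.h


class ListHead(ctypes.Structure):
    # struct list_head { struct list_head *next, *prev; }
    # Pointers are modeled as u64 to mimic the x86_64 layout.
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


class XtTable(ctypes.Structure):
    # Field order copied from the struct xt_table definition quoted above.
    _fields_ = [
        ("list", ListHead),
        ("valid_hooks", ctypes.c_uint),
        ("private", ctypes.c_uint64),   # struct xt_table_info *
        ("ops", ctypes.c_uint64),       # struct nf_hook_ops *
        ("me", ctypes.c_uint64),        # struct module *
        ("af", ctypes.c_uint8),
        ("priority", ctypes.c_int),
        ("name", ctypes.c_char * XT_TABLE_MAXNAMELEN),
    ]


# Index of `name` among the members, and its byte offset after padding.
field_index = [name for name, _ in XtTable._fields_].index("name")
print(field_index, XtTable.name.offset)
```

On a 64-bit platform this gives member index 7 and byte offset 56, matching the spec in the log. That suggests the definition compiled into the BPF object agrees with this header, and that the failure comes from the struct being absent from the BTF visible at load time rather than from a missing `name` field.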

menglongdong commented 1 year ago

It looks fine, which is strange... Please send your kernel config to imagedong@tencent.com, and I will analyze the cause later.

hezhiye commented 1 year ago

Is it enough to send you the /proc/config.gz file?

menglongdong commented 1 year ago

Yes.

hezhiye commented 1 year ago

config.gz: it turns out I can upload my kernel config here.

menglongdong commented 1 year ago

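I built a kernel from this config and could not reproduce the problem.

Also, the kernel config you gave me is for version 5.18, but the code above is from version 5.16?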

linux-5.16/include/linux/netfilter/x_tables.h

hezhiye commented 1 year ago

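Oh, sorry, I pasted the wrong thing. The kernel really is version 5.18; the one I posted by mistake was the glibc one. That should not matter. If I set DEBUG_INFO_BTF_MODULES=y, would this problem go away as well?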

menglongdong commented 1 year ago

Possibly... Your BTF_MODULES=n is probably caused by your pahole version being too old.

hezhiye commented 1 year ago

pahole version: v1.22

hezhiye commented 1 year ago

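I think it is indeed a pahole problem. In my Yocto build I forced the pahole path in the final BTF generation step, so the kernel was built successfully. However, the pahole path was not forced when DEBUG_INFO_BTF_MODULES and BTF_MODULES were evaluated, so they ended up as n.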
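The explanation above fits a config-time check that cannot find pahole at all, rather than one that finds a version that is too old. A minimal sketch of such a version check, under two stated assumptions: that the kernel build folds `v1.22` into the integer 122, and that split (module) BTF needs at least 119. Both should be verified against the Kconfig files of the kernel being built; the function names are hypothetical.

```python
import re

# Assumed minimum pahole version code for module BTF; verify against the
# Kconfig of the kernel you are building.
SPLIT_BTF_MIN = 119


def pahole_version_code(version_output):
    """Turn 'v1.22' into 122; return 0 when no version can be parsed.

    An empty string models the case where pahole was not found when the
    kernel configuration was evaluated.
    """
    m = re.search(r"v(\d+)\.(\d+)", version_output)
    if not m:
        return 0
    return int(m.group(1) + m.group(2))


def supports_split_btf(version_output):
    """True when the reported pahole version meets the assumed threshold."""
    return pahole_version_code(version_output) >= SPLIT_BTF_MIN


print(supports_split_btf("v1.22"), supports_split_btf(""))
```

Under these assumptions v1.22 passes the check while a missing pahole does not, which is consistent with the options turning into n only because the path was not set at configuration time.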

hezhiye commented 1 year ago

The pahole issue was indeed the cause.