p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
679 stars 443 forks source link

p4-ebpf permission denied: func 'map_initializer' doesn't have 1-th argument: invalid bpf_context #4310

Open scsullivan12 opened 10 months ago

scsullivan12 commented 10 months ago

When attempting to utilize cilium's pwru --filter-trace-tc option with a p4-ebpf psa program loaded using nikss-ctl, I get the below error. The p4-ebpf psa program compiles and loads without issue. Running pwru --filter-trace-tc when the p4-ebpf program is not loaded runs as expected. p4-ebpf c/bc/o files attached.

out-gtpu-ecmp.zip

$ sudo ./pwru --filter-trace-tc
2023/12/22 10:59:20 Failed to load objects: Verifier error: load program: permission denied:
        func#0 @0
        func#1 @38
        func#2 @854
        func#3 @861
        reg type unsupported for arg#0 function fentry_tc#1
        0: R1=ctx(off=0,imm=0) R10=fp0
        ; int BPF_PROG(fentry_tc, struct sk_buff *skb) {
        0: (bf) r6 = r1                       ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
        ; int BPF_PROG(fentry_tc, struct sk_buff *skb) {
        1: (79) r7 = *(u64 *)(r6 +0)
        func 'map_initializer' doesn't have 1-th argument
        invalid bpf_context access off=0 size=8
        processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

program fentry_tc: load program: permission denied: func 'map_initializer' doesn't have 1-th argument: invalid bpf_context access off=0 size=8 (11 line(s) omitted)

I see the same exact error on the following two systems:

- Ubuntu 23: Linux pwru-dev 6.5.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 14 14:59:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

- Oracle Linux 9: Linux p4-dev 5.15.0-200.131.27.1.el9uek.x86_64 #2 SMP Wed Nov 22 18:01:23 PST 2023 x86_64 x86_64 x86_64 GNU/Linux

For context, I was experimenting with pwru to see if I could trace a packet through a p4-ebpf processing pipeline. Similar to ovs-appctl ofproto/trace, except with live traffic.

tatry commented 10 months ago

Looks like the problem lies between nikss-ctl and pwru, not the p4c.

If look at the pwru how it loads BPF probe you can see that it tries to attach probe to every TC program. At this point everything is OK.

Now look at the list of loaded BPF programs. There is a program with name map_initializer. This is a special program to initialize some objects, it takes no arguments by design. It is executed only once - when pipeline is loaded, later it is not needed. In general, it is OK that program exists, but could be removed by nikss-ctl.

The problem arise when pwru tries to attach probe to map_initializer, because it expects that every TC program is a regular TC filter, but it is not true here.

As a ~solution~ workaround just remove program map_initializer (but not the whole pipeline) using rm (path to the program you can obtain by passing option -f to bpftool) and then use pwru.

scsullivan12 commented 10 months ago

Thanks for looking into this Jan.

When I remove program map_initializer (using rm), I get the following pwru error:

./pwru --filter-trace-tc

2024/01/02 10:31:38 Failed to load objects: Verifier error: load program: invalid argument: func#0 @0 func#1 @38 func#2 @854 func#3 @861 Subprog map_initializer doesn't exist processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

program fentry_tc: load program: invalid argument: Subprog map_initializer doesn't exist (5 line(s) omitted)

Prior to removing program map_initializer:

/usr/share/bcc/introspection/bps 511

  BID TYPE                 UID  #MAPS LoadTime     NAME
  511 sched cls              0      2 Jan02/10:40  map_initializer

 MID TYPE            FLAGS         KeySz  ValueSz  MaxEnts NAME
 489 array           0x0               4        4        1      ingress_tbl_mac
 490 array           0x0               4     8192     1      crc_lookup_tbl
tatry commented 10 months ago

I'm not sure why there is still reference to map_initializer.

What happens if you also remove both tables, ingress_tbl_mac and crc_lookup_tbl? (This is safe operation, you will only decrement reference counter. Only the side effect is that ingress_tbl_mac will be invisible for nikss-ctl, but should be possible to fix it later with bpftool.)

Another option is just to add the missing argument for map_initializer, but it is possible to break nikss-ctl. This can be done via C code update - add parameter struct __sk_buff *skb to a function map_initializer in section classifier/map-initializer (in your case line 490).

Unfortunately you can't just remove subprogram map_initializer from C source code (lines 489-519), because it makes important map initialization for CRC32 and default action for ingress_tbl_mac. But if your code use CRC16 instead and a NoAction for ingress_tbl_mac, removing program from C code could be possible.

scsullivan12 commented 10 months ago

For testing purposes, I commented out map_initializer completely and got the following:

./pwru --filter-trace-tc
2024/01/04 17:39:47 Failed to load objects: Verifier error: load program: invalid argument:
    func#0 @0
    func#1 @38
    func#2 @854
    func#3 @861
    Subprog xdp_func doesn't exist
    processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

program fentry_tc: load program: invalid argument: Subprog xdp_func doesn't exist (5 line(s) omitted)

So then I commented out xdp_func completely, and it ran until I got program fentry_tc: load program: operation not supported: func bpf_get_func_ip#173 not supported for program type 3 (730 line(s) omitted). That function is not supported in my kernel: 5.15.0-201.135.6.el9uek.x86_64, so that's explainable.

I then tried the same thing with both map_initializer and xdp_func commented out with kernel: 6.5.0-14-generic. I get the following:

./pwru --filter-trace-tc
2024/01/04 17:46:32 Failed to load objects: Verifier error: load program: invalid argument:
        func#0 @0
        func#1 @38
        func#2 @854
        func#3 @861
        Subprog tc_ingress_func doesn't exist
        processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

program fentry_tc: load program: invalid argument: Subprog tc_ingress_func doesn't exist (5 line(s) omitted)

tc_ingress_fuct does exist

rm@pwru-dev:~/Documents/pwru$ sudo bps
      BID TYPE                 UID  #MAPS LoadTime     NAME
        2 tracing                0      1 Jan04/17:33  hid_tail_call
        5 cgroup_device          0      0 Jan04/17:33  sd_devices
        6 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
        7 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
        8 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
        9 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
       10 cgroup_device          0      0 Jan04/17:33  sd_devices
       11 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
       12 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
       13 cgroup_device          0      0 Jan04/17:33  sd_devices
       14 cgroup_device          0      0 Jan04/17:33  sd_devices
       15 cgroup_device          0      0 Jan04/17:33  sd_devices
       16 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
       17 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
       18 cgroup_device          0      0 Jan04/17:33  sd_devices
       19 cgroup_device          0      0 Jan04/17:33  sd_devices
       20 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
       21 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
       32 cgroup skb             0      0 Jan04/17:33  sd_fw_egress
       33 cgroup skb             0      0 Jan04/17:33  sd_fw_ingress
       36 cgroup_device       1000      1 Jan04/17:34
       40 sched cls              0     21 Jan04/17:34  tc_ingress_func
       41 sched cls              0      1 Jan04/17:34  tc_egress_func

Your other suggestions resulted in same/similar results. I don't want to waste people's time on this, but do you think there is value in bringing this to celium/pwru's attention?

tatry commented 10 months ago

@scsullivan12

do you think there is value in bringing this to celium/pwru's attention?

Hard to say if there is a such value, but you can try.

brb commented 2 weeks ago

x-posting a potential solution in case anyone wants to take a stab at it - https://github.com/cilium/pwru/issues/404#issuecomment-2249913943