Open scsullivan12 opened 10 months ago
Looks like the problem lies between nikss-ctl and pwru, not the p4c.
If look at the pwru how it loads BPF probe you can see that it tries to attach probe to every TC program. At this point everything is OK.
Now look at the list of loaded BPF programs. There is a program with name map_initializer
. This is a special program to initialize some objects, it takes no arguments by design. It is executed only once - when pipeline is loaded, later it is not needed. In general, it is OK that program exists, but could be removed by nikss-ctl.
The problem arise when pwru tries to attach probe to map_initializer
, because it expects that every TC program is a regular TC filter, but it is not true here.
As a ~solution~ workaround just remove program map_initializer
(but not the whole pipeline) using rm
(path to the program you can obtain by passing option -f to bpftool) and then use pwru.
Thanks for looking into this Jan.
When I remove program map_initializer (using rm), I get the following pwru error:
2024/01/02 10:31:38 Failed to load objects: Verifier error: load program: invalid argument: func#0 @0 func#1 @38 func#2 @854 func#3 @861 Subprog map_initializer doesn't exist processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
program fentry_tc: load program: invalid argument: Subprog map_initializer doesn't exist (5 line(s) omitted)
Prior to removing program map_initializer:
BID TYPE UID #MAPS LoadTime NAME
511 sched cls 0 2 Jan02/10:40 map_initializer
MID TYPE FLAGS KeySz ValueSz MaxEnts NAME
489 array 0x0 4 4 1 ingress_tbl_mac
490 array 0x0 4 8192 1 crc_lookup_tbl
I'm not sure why there is still reference to map_initializer
.
What happens if you also remove both tables, ingress_tbl_mac
and crc_lookup_tbl
? (This is safe operation, you will only decrement reference counter. Only the side effect is that ingress_tbl_mac
will be invisible for nikss-ctl, but should be possible to fix it later with bpftool.)
Another option is just to add the missing argument for map_initializer
, but it is possible to break nikss-ctl. This can be done via C code update - add parameter struct __sk_buff *skb
to a function map_initializer
in section classifier/map-initializer
(in your case line 490).
Unfortunately you can't just remove subprogram map_initializer
from C source code (lines 489-519), because it makes important map initialization for CRC32 and default action for ingress_tbl_mac
. But if your code use CRC16 instead and a NoAction for ingress_tbl_mac
, removing program from C code could be possible.
For testing purposes, I commented out map_initializer
completely and got the following:
./pwru --filter-trace-tc
2024/01/04 17:39:47 Failed to load objects: Verifier error: load program: invalid argument:
func#0 @0
func#1 @38
func#2 @854
func#3 @861
Subprog xdp_func doesn't exist
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
program fentry_tc: load program: invalid argument: Subprog xdp_func doesn't exist (5 line(s) omitted)
So then I commented out xdp_func
completely, and it ran until I got program fentry_tc: load program: operation not supported: func bpf_get_func_ip#173 not supported for program type 3 (730 line(s) omitted).
That function is not supported in my kernel: 5.15.0-201.135.6.el9uek.x86_64
, so that's explainable.
I then tried the same thing with both map_initializer
and xdp_func
commented out with kernel: 6.5.0-14-generic
. I get the following:
./pwru --filter-trace-tc
2024/01/04 17:46:32 Failed to load objects: Verifier error: load program: invalid argument:
func#0 @0
func#1 @38
func#2 @854
func#3 @861
Subprog tc_ingress_func doesn't exist
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
program fentry_tc: load program: invalid argument: Subprog tc_ingress_func doesn't exist (5 line(s) omitted)
tc_ingress_fuct does exist
rm@pwru-dev:~/Documents/pwru$ sudo bps
BID TYPE UID #MAPS LoadTime NAME
2 tracing 0 1 Jan04/17:33 hid_tail_call
5 cgroup_device 0 0 Jan04/17:33 sd_devices
6 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
7 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
8 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
9 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
10 cgroup_device 0 0 Jan04/17:33 sd_devices
11 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
12 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
13 cgroup_device 0 0 Jan04/17:33 sd_devices
14 cgroup_device 0 0 Jan04/17:33 sd_devices
15 cgroup_device 0 0 Jan04/17:33 sd_devices
16 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
17 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
18 cgroup_device 0 0 Jan04/17:33 sd_devices
19 cgroup_device 0 0 Jan04/17:33 sd_devices
20 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
21 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
32 cgroup skb 0 0 Jan04/17:33 sd_fw_egress
33 cgroup skb 0 0 Jan04/17:33 sd_fw_ingress
36 cgroup_device 1000 1 Jan04/17:34
40 sched cls 0 21 Jan04/17:34 tc_ingress_func
41 sched cls 0 1 Jan04/17:34 tc_egress_func
Your other suggestions resulted in same/similar results. I don't want to waste people's time on this, but do you think there is value in bringing this to celium/pwru's attention?
@scsullivan12
do you think there is value in bringing this to celium/pwru's attention?
Hard to say if there is a such value, but you can try.
x-posting a potential solution in case anyone wants to take a stab at it - https://github.com/cilium/pwru/issues/404#issuecomment-2249913943
When attempting to utilize cilium's pwru
--filter-trace-tc
option with a p4-ebpf psa program loaded using nikss-ctl, I get the below error. The p4-ebpf psa program compiles and loads without issue. Runningpwru --filter-trace-tc
when the p4-ebpf program is not loaded runs as expected. p4-ebpf c/bc/o files attached.out-gtpu-ecmp.zip
I see the same exact error on the following two systems:
For context, I was experimenting with pwru to see if I could trace a packet through a p4-ebpf processing pipeline. Similar to ovs-appctl ofproto/trace, except with live traffic.