Open blaxie opened 3 years ago
Hi!
There are two problems.
Firstly, you will not be able to ping between Mininet hosts using the test_tunneling.p4 program, because it encapsulates packets into MPLS, which is not stripped by the host, so the packets are not correctly identified. Also, you need to populate the table entries using ovs-ofctl update-bpf-map. Did you do that?
Secondly, in the case of the vlan.c example there is probably a bug in the compiler related to how the packet length check is done. We'll fix it, but to verify, could you modify the C program to use the following piece of code?

if (sizeof(struct ether_header) + sizeof(struct iphdr) < pkt_len) { // instead of: if (pkt_len < sizeof(struct ether_header) + sizeof(struct iphdr)) {
    return 1;
}

With the above modification, re-compile the program using clang and try to ping again.
Hi,
Thanks for getting back to me so quickly.
I made the change to vlan.c and now the pings are being forwarded properly. I believe that the issue with my own P4 programs is related to the problem with vlan.c, since any time I return 1 before the following code in the deparser block, the pings are forwarded as normal:

int outHeaderOffset = BYTES(outHeaderLength) - BYTES(packetOffsetInBits);
pkt = ubpf_adjust_head(ctx, outHeaderOffset);
With regard to test-tunneling.p4, I had been running tcpdump to check for MPLS packets, but none had been received. I had not populated the table entries, as I thought the program forwarded packets by default. Now that I have gone back to try that, I am getting a different error: the BPF program seems to unload after I add the flow rules, and OFPT_ERROR (xid=0x6): OFPBAC_BAD_ARGUMENT is returned once I try to add the second flow rule.
I'll reproduce the issue with the packet length verification locally and provide a fix to the p4c-ubpf compiler, if needed.
Regarding OFPT_ERROR (xid=0x6): OFPBAC_BAD_ARGUMENT, are you able to provide logs from OVS? By default, logs are stored in /var/log/openvswitch/ovs-vswitchd.log. Note also that the uBPF verifier is at an experimental stage, and if you don't use the --disable-bpf-verifier flag with the ./configure command, you may encounter some unexpected bugs.
The first set of logs below is from when both flow rules are added successfully but the pings fail, while the second set is from when OFPT_ERROR (xid=0x6): OFPBAC_BAD_ARGUMENT is returned on the second flow rule. Both logs were produced when running test-tunneling.p4 with the BPF verifier disabled.
2020-11-18T11:09:43Z|00001|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 0
2020-11-18T11:09:43Z|00002|ovs_numa|INFO|Discovered 1 NUMA nodes and 1 CPU cores
2020-11-18T11:09:43Z|00003|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock: connecting...
2020-11-18T11:09:43Z|00004|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock: connected
2020-11-18T11:09:43Z|00005|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.13.90
2020-11-18T11:09:45Z|00006|pmd_perf|INFO|Estimated TSC frequency: 3192760 KHz
2020-11-18T11:09:45Z|00007|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
2020-11-18T11:09:45Z|00008|ofproto_dpif|INFO|netdev@ovs-netdev: VLAN header stack length probed as 1
2020-11-18T11:09:45Z|00009|ofproto_dpif|INFO|netdev@ovs-netdev: MPLS label stack length probed as 3
2020-11-18T11:09:45Z|00010|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports truncate action
2020-11-18T11:09:45Z|00011|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports unique flow ids
2020-11-18T11:09:45Z|00012|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports clone action
2020-11-18T11:09:45Z|00013|ofproto_dpif|INFO|netdev@ovs-netdev: Max sample nesting level probed as 10
2020-11-18T11:09:45Z|00014|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports eventmask in conntrack action
2020-11-18T11:09:45Z|00015|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_clear action
2020-11-18T11:09:45Z|00016|ofproto_dpif|INFO|netdev@ovs-netdev: Max dp_hash algorithm probed to be 1
2020-11-18T11:09:45Z|00017|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports check_pkt_len action
2020-11-18T11:09:45Z|00018|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports timeout policy in conntrack action
2020-11-18T11:09:45Z|00019|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_state
2020-11-18T11:09:45Z|00020|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_zone
2020-11-18T11:09:45Z|00021|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_mark
2020-11-18T11:09:45Z|00022|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_label
2020-11-18T11:09:45Z|00023|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_state_nat
2020-11-18T11:09:45Z|00024|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_orig_tuple
2020-11-18T11:09:45Z|00025|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_orig_tuple6
2020-11-18T11:09:45Z|00026|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports IPv6 ND Extensions
2020-11-18T11:09:45Z|00027|bridge|INFO|bridge s1: added interface s1-eth2 on port 2
2020-11-18T11:09:45Z|00028|bridge|INFO|bridge s1: added interface s1-eth1 on port 1
2020-11-18T11:09:45Z|00029|bridge|INFO|bridge s1: added interface s1 on port 65534
2020-11-18T11:09:45Z|00030|bridge|INFO|bridge s1: using datapath ID 0000000000000001
2020-11-18T11:09:45Z|00031|connmgr|INFO|s1: added service controller "punix:/usr/local/var/run/openvswitch/s1.mgmt"
2020-11-18T11:09:53Z|00032|memory|INFO|17280 kB peak resident set size after 10.0 seconds
2020-11-18T11:09:53Z|00033|memory|INFO|handlers:1 ports:3 revalidators:1 rules:4 udpif keys:2
2020-11-18T11:10:04Z|00034|connmgr|INFO|s1<->unix#3: 1 flow_mods in the last 0 s (1 adds)
2020-11-18T11:10:07Z|00035|connmgr|INFO|s1<->unix#6: 1 flow_mods in the last 0 s (1 adds)
Segmentation fault
2020-11-18T11:26:56Z|00045|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
2020-11-18T11:26:56Z|00046|ofproto_dpif|INFO|netdev@ovs-netdev: VLAN header stack length probed as 1
2020-11-18T11:26:56Z|00047|ofproto_dpif|INFO|netdev@ovs-netdev: MPLS label stack length probed as 3
2020-11-18T11:26:56Z|00048|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports truncate action
2020-11-18T11:26:56Z|00049|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports unique flow ids
2020-11-18T11:26:56Z|00050|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports clone action
2020-11-18T11:26:56Z|00051|ofproto_dpif|INFO|netdev@ovs-netdev: Max sample nesting level probed as 10
2020-11-18T11:26:56Z|00052|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports eventmask in conntrack action
2020-11-18T11:26:56Z|00053|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_clear action
2020-11-18T11:26:56Z|00054|ofproto_dpif|INFO|netdev@ovs-netdev: Max dp_hash algorithm probed to be 1
2020-11-18T11:26:56Z|00055|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports check_pkt_len action
2020-11-18T11:26:56Z|00056|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports timeout policy in conntrack action
2020-11-18T11:26:56Z|00057|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_state
2020-11-18T11:26:56Z|00058|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_zone
2020-11-18T11:26:56Z|00059|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_mark
2020-11-18T11:26:56Z|00060|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_label
2020-11-18T11:26:56Z|00061|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_state_nat
2020-11-18T11:26:56Z|00062|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_orig_tuple
2020-11-18T11:26:56Z|00063|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_orig_tuple6
2020-11-18T11:26:56Z|00064|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports IPv6 ND Extensions
2020-11-18T11:26:56Z|00065|bridge|INFO|bridge s1: added interface s1-eth2 on port 2
2020-11-18T11:26:56Z|00066|bridge|INFO|bridge s1: added interface s1-eth1 on port 1
2020-11-18T11:26:56Z|00067|bridge|INFO|bridge s1: added interface s1 on port 65534
2020-11-18T11:26:56Z|00068|bridge|INFO|bridge s1: using datapath ID 0000000000000001
2020-11-18T11:26:56Z|00069|connmgr|INFO|s1: added service controller "punix:/usr/local/var/run/openvswitch/s1.mgmt"
2020-11-18T11:26:58Z|00070|connmgr|INFO|s1<->unix#13: 1 flow_mods in the last 0 s (1 adds)
Segmentation fault
I don't have any experience debugging OVS, but when I attached GDB to the ovs-vswitchd process, this was all I got:
(gdb) c
Continuing.
[New Thread 0x7f5dca967a00 (LWP 5492)]
[New Thread 0x7f5dca4e0700 (LWP 5509)]
[New Thread 0x7f5dc9cdf700 (LWP 5510)]
[New Thread 0x7f5dc94de700 (LWP 5511)]
[New Thread 0x7f5dbbfff700 (LWP 5515)]
[New Thread 0x7f5dbb7fe700 (LWP 5516)]
Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault.
0x00007f5dca99502b in ?? ()
(gdb) bt
#0 0x00007f5dca99502b in ?? ()
#1 0x000055d6dd0dd058 in ?? ()
#2 0x00007ffc04f65580 in ?? ()
#3 0x0000000000000000 in ?? ()
Hi,
Unfortunately, the logs don't say too much. Could you re-compile OVS with ./configure CFLAGS='-g -O2' --disable-bpf-verifier and debug the issue with gdb again?
Hi,
Recompiling with the debug symbols does not change the output of gdb. The address referred to at #0 is not mapped, while the addresses at #1 and #2 are mapped to the heap and stack respectively. Sorry I couldn't be of more help with this.
No problem. I'll try to find time to reproduce the issue locally. I'll keep you updated.
Okay, thanks!
@blaxie Could you provide the update-bpf-map commands which you used to populate the table entries?
I hadn't provided any table entries, as I wasn't interested in encapsulating or decapsulating any packets when testing the pings. I have since tried to add a table entry using the command sudo ovs-ofctl update-bpf-map s1 1 0 key 2 0 0 10 value 0 0 0 0. This does not seem to produce any errors. However, the program still fails to forward packets once the table entry has been added, failing in the same manner as before.
If you send raw IP packets you should populate tbl_downstream, which means that you should use the following command:

ovs-ofctl update-bpf-map s1 1 1 key 2 0 0 10 value 0 0 0 0

With the above command you should be able to receive packets encapsulated in the MPLS header.
EDIT: Just to clarify, the BPF maps are numbered in the order they are defined in the P4 program, so tbl_upstream has index 0 and tbl_downstream has index 1.
When I use the command ovs-ofctl update-bpf-map s1 1 1 key 2 0 0 10 value 0 0 0 0, I get another error:

OFPT_ERROR (xid=0x4): OFPBRC_EPERM
NXT_BPF_UPDATE_MAP (xid=0x4):

As well as that, the ovs-ofctl show-bpf-prog s1 1 command shows that there are actually 4 maps loaded, and I have no idea why:

NXT_BPF_SHOW_PROG_REPLY (xid=0x4):
id 1: loaded_at=2020-11-24T11:20:48
map_ids 0,1,2,3
The P4 code is exactly the same as the code you have been discussing, with the tbl_upstream and tbl_downstream tables.
Ok, I think I know why you have this problem. p4rt-ovs has not been adjusted to the newest version of p4c yet. The latest PR to the uBPF backend (https://github.com/p4lang/p4c/pull/2572) introduced support for default actions, so for every P4 table with a default action there will be 2 BPF maps generated. That's why the number of BPF maps equals 4.
Now, taking into account the latest change in p4c, the command you should use to populate tbl_downstream is as follows:

ovs-ofctl update-bpf-map s1 1 3 key 2 0 0 10 value 0 0 0 0

Let me know if it works.
The table entry seems to be added without any problems when I use ovs-ofctl update-bpf-map s1 1 2 key 2 0 0 10 value 0 0 0 0, but adding the flow rules after this (e.g. sudo ovs-ofctl add-flow s1 in_port=1,actions=prog:1,output:2) is still causing the same segmentation fault as before.
Could you first add the flow and then insert the BPF map entry? Unfortunately, P4rt-OVS is not actively developed anymore and it is not free of bugs.
I've probably found the root cause of the segmentation fault.
P4rt-ovs calls the uBPF program in this way: https://github.com/Orange-OpenSource/p4rt-ovs/blob/2511a57e5411aadb824d914281afddbb203fd07c/lib/bpf.h#L39-L45
The packet and its length are passed to the uBPF program as parameters. But recent versions of the p4c-ubpf compiler generate the following entry point for the uBPF program:

uint64_t entry(void *ctx, struct standard_metadata *std_meta){
    void *pkt = ubpf_packet_data(ctx);
    uint32_t pkt_len = std_meta->packet_length;
    struct headers_t hdr = {

So a packet and a pointer to struct standard_metadata are expected. In other words, the packet's length is treated as a pointer to the standard_metadata struct. This must fail, because the packet's length isn't a valid pointer. That would also explain the lack of debugging symbols during the crash.
Workaround 1
Manually edit the auto-generated entry point to something like this:

uint64_t entry(void *ctx, uint32_t pkt_len){
    void *pkt = ubpf_packet_data(ctx);
    struct headers_t hdr = {

Workaround 2
The changes connected with standard_metadata were not merged into this repository, so you can use one of these forks:
- https://github.com/osinstom/ovs.p4 (@osinstom which branch? I don't remember)
- https://github.com/tatry/p4rt-ovs/tree/metadata_ubpf
Hello, I tried these two methods, but I am still getting an error when I dump-map. What should I do?
When running the Mininet script test_topo.py with test_tunneling.p4 loaded, I am getting a Destination Host Unreachable error when attempting to ping h2 from h1. This error occurs even after inserting flow rules using the commands sudo ovs-ofctl add-flow s1 in_port=1,actions=prog:1,output:2 and sudo ovs-ofctl add-flow s1 in_port=2,actions=prog:1,output:1, as instructed in the README.

I am getting the same error when using other programs (such as vlan.c) and my own P4 program, which trivially extracts the Ethernet header, passes the packet through, and deparses the Ethernet header again. All of these programs show the same behaviour when compiled with either clang-6.0 or clang-10.

I have tested the pings without loading any BPF program and they work just fine. The only way I can seem to get the pings to work is by editing the C code of each program to trivially return 1, as seen with vlan.c, where I have commented out most of the entry function.

I realise that this is unlikely to be a bug, but I would appreciate it if anyone had some idea as to what I have done wrong, as I believe I have followed all of the installation steps correctly. If you require any more details, I am happy to provide them.