oxidecomputer / p4

A P4 compiler
Mozilla Public License 2.0
111 stars 5 forks source link

Fix decapsulation corruption. #18

Closed rcgoodfellow closed 1 year ago

rcgoodfellow commented 1 year ago

There has been a report of header corruption for geneve packets that are being decapsulated by sidecar-lite. The purpose of this PR is to isolate and fix the issue.

Report Details

We see the following form dtrace probes running on the host system. The first probe shows the packet that sidecar-lite sees on ingress. The second probe shows what sidecar-lite produces on egress. The transformation that happens is the expected decapsulation.

 17  10608 _ZN62_$LT$sidecar_lite..main_pipeline$u20$as$u20$p4rs..Pipeline$GT$14process_packet17h7ad9a0975a60c923E:parser_accepted
ethernet: dst 0 src 0 ether_type 86dd
sidecar: ∅
arp: ∅
ipv4: ∅
ipv6: version 60 traffic_class 0 flow_label 0 payload_len 5a next_hdr 11 hop_limit ff src fd001122334401010000000000000001 dst fd000099000000000000000000000001
icmp: ∅
tcp: ∅
udp: src_port 1e61 dst_port 17c1 len 5a checksum 0
geneve: version 0 opt_len 0 ctrl 0 crit 0 reserved 0 protocol 6558 vni 63 reserved2 0
inner_eth: dst a84025f99999 src a84025f6f994 ether_type 800
inner_ipv4: version 40 ihl 5 diffserv 0 total_len 3c identification c89b flags 0 frag_offset 0 ttl 40 protocol 11 hdr_checksum 96de src a5500d3 dst 8080808
inner_ipv6: ∅
inner_tcp: ∅
inner_udp: src_port 8de8 dst_port 35 len 28 checksum 6450

 17  10606 _ZN62_$LT$sidecar_lite..main_pipeline$u20$as$u20$p4rs..Pipeline$GT$14process_packet17h7ad9a0975a60c923E:ingress_accepted
ethernet: dst 68d79a1f77a1 src a84025f6f994 ether_type 800
sidecar: ∅
arp: ∅
ipv4: version 40 ihl 5 diffserv 0 total_len 3c identification c89b flags 0 frag_offset 0 ttl 40 protocol 11 hdr_checksum 96de src a5500d3 dst 8080808
ipv6: ∅
icmp: ∅
tcp: ∅
udp: src_port 8de8 dst_port 35 len 28 checksum 6450
geneve: ∅
inner_eth: ∅
inner_ipv4: ∅
inner_ipv6: ∅
inner_tcp: ∅
inner_udp: ∅

However, what comes out the interface attached to the softnpu-standalone instance is a corrupt packet as observed by snoop.

workstation-1 ➜  omicron git:(integrate-softnpu) ✗ pfexec snoop -rvd sc0_1 'host 10.85.0.211'
Using device sc0_1 (promiscuous mode)
ETHER:  ----- Ether Header -----
ETHER:
ETHER:  Packet 1 arrived at 0:01:51.10931
ETHER:  Packet size = 74 bytes
ETHER:  Destination = 68:d7:9a:1f:77:a1,
ETHER:  Source      = a8:40:25:f6:f9:94,
ETHER:  Ethertype = 0800 (IP)
ETHER:
IP:   ----- IP Header -----
IP:
IP:   Version = 4
IP:   Header length = 0 bytes
IP:   Type of service = 0x00
IP:         xxx. .... = 0 (precedence)
IP:         ...0 .... = normal delay
IP:         .... 0... = normal throughput
IP:         .... .0.. = normal reliability
IP:         .... ..0. = not ECN capable transport
IP:         .... ...0 = no ECN congestion experienced
IP:   Total length = 60 bytes
IP:   Identification = 53634
IP:   Flags = 0x0
IP:         .0.. .... = may fragment
IP:         ..0. .... = last fragment
IP:   Fragment offset = 0 bytes
IP:   Time to live = 64 seconds/hops
IP:   Protocol = 17 (UDP)
IP:   Header checksum = 8df7
IP:   Source address = 10.85.0.211, 10.85.0.211
IP:   Destination address = 8.8.8.8, 8.8.8.8
IP:   No options
IP:
UDP:  ----- UDP Header -----
UDP:
UDP:  Source port = 16384
UDP:  Destination port = 60
UDP:  Length = 53634 (Not all data contained in this fragment)
UDP:  Checksum = 0000 (no checksum)
UDP:

ETHER:  ----- Ether Header -----
ETHER:
ETHER:  Packet 2 arrived at 0:01:56.10719
ETHER:  Packet size = 74 bytes
ETHER:  Destination = 68:d7:9a:1f:77:a1,
ETHER:  Source      = a8:40:25:f6:f9:94,
ETHER:  Ethertype = 0800 (IP)
ETHER:
IP:   ----- IP Header -----
IP:
IP:   Version = 4
IP:   Header length = 0 bytes
IP:   Type of service = 0x00
IP:         xxx. .... = 0 (precedence)
IP:         ...0 .... = normal delay
IP:         .... 0... = normal throughput
IP:         .... .0.. = normal reliability
IP:         .... ..0. = not ECN capable transport
IP:         .... ...0 = no ECN congestion experienced
IP:   Total length = 60 bytes
IP:   Identification = 53796
IP:   Flags = 0x0
IP:         .0.. .... = may fragment
IP:         ..0. .... = last fragment
IP:   Fragment offset = 0 bytes
IP:   Time to live = 64 seconds/hops
IP:   Protocol = 17 (UDP)
IP:   Header checksum = 8d55
IP:   Source address = 10.85.0.211, 10.85.0.211
IP:   Destination address = 8.8.8.8, 8.8.8.8
IP:   No options
IP:
UDP:  ----- UDP Header -----
UDP:
UDP:  Source port = 16384
UDP:  Destination port = 60
UDP:  Length = 53796 (Not all data contained in this fragment)
UDP:  Checksum = 0000 (no checksum)
UDP:

CC: @internet-diglett

rcgoodfellow commented 1 year ago

The test in the initial commit is working as expected. Next I'm going to try to make the packets look exactly like they do in the report to see what happens.

rcgoodfellow commented 1 year ago

I've got repro on this with 186844c

ry@masaka:~/src/p4$ RUST_BACKTRACE=1 cargo test geneve_decap -- --nocapture
[snip]
running 1 test
Decapped IP: Ipv4Packet { version : 0, header_length : 0, dscp : 0, ecn : 0, total_length : 35, identification : 0, flags : 0, fragment_offset : 0, ttl : 0, next_level_protocol : IpNextHeaderProtocol(17), checksum : 0, source : 10.0.0.1, destination : 8.8.8.8, options : [],  }
Decapped UDP: UdpPacket { source : 47, destination : 74, length : 0, checksum : 99,  }
thread 'decap::geneve_decap' panicked at 'assertion failed: `(left == right)`
  left: `Ipv4Packet { version : 0, header_length : 5, dscp : 0, ecn : 0, total_length : 35, identification : 0, flags : 0, fragment_offset : 0, ttl : 0, next_level_protocol : IpNextHeaderProtocol(17), checksum : 0, source : 10.0.0.1, destination : 8.8.8.8, options : [],  }`,
 right: `Ipv4Packet { version : 0, header_length : 0, dscp : 0, ecn : 0, total_length : 35, identification : 0, flags : 0, fragment_offset : 0, ttl : 0, next_level_protocol : IpNextHeaderProtocol(17), checksum : 0, source : 10.0.0.1, destination : 8.8.8.8, options : [],  }`', test/src/decap.rs:106:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:203:5
   4: tests::decap::geneve_decap
             at ./src/decap.rs:106:5
   5: tests::decap::geneve_decap::{{closure}}
             at ./src/decap.rs:16:1
   6: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test decap::geneve_decap ... FAILED

failures:

failures:
    decap::geneve_decap

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 10 filtered out; finished in 0.05s
rcgoodfellow commented 1 year ago

explicitly setting the header length to 5 has no effect

if (hdr.inner_ipv4.isValid()) {
    hdr.ipv4 = hdr.inner_ipv4;
    hdr.ipv4.ihl = 4w5;
    hdr.ipv4.setValid();
    hdr.ipv6.setInvalid();
    hdr.inner_ipv4.setInvalid();
}

so this appears to be an issue with writing out the ipv4 header - e.g. sub-byte field serialization is not working right in this case.