faucetsdn / faucet

FAUCET is an OpenFlow controller for multi table OpenFlow 1.3 switches, that implements layer 2 switching, VLANs, ACLs, and layer 3 IPv4 and IPv6 routing.
http://faucet.nz
Apache License 2.0
555 stars 190 forks source link

Multi Mirroring ends up with unintentional final flows in some cases #3334

Closed naknakshota closed 4 years ago

naknakshota commented 4 years ago

Currently FaucetUntaggedMultiMirrorTest calls tcpdumphelper to expect 8 packets received by filter which is the correct expectation of packets received if it calls tcpdumphelper's execute function once.

I tried adding some debug prints to the result of tcpdump's execute function and it seems like it's being called three separate times

*****tcpdump helper 21:50:59.709425 f6:36:48:9f:31:26 > 46:79:ec:30:85:19, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 32742, offset 0, flags [DF], proto ICMP (1), length 84)10.0.0.1 > 10.0.0.2: ICMP echo request, id 744, seq 1, length 6421:50:59.711576 46:79:ec:30:85:19 > f6:36:48:9f:31:26, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 39848, offset 0, flags [none], proto ICMP (1), length 84)10.0.0.2 > 10.0.0.1: ICMP echo reply, id 744, seq 1, length 642 packets captured2 packets received by filter0 packets dropped by kernel **************

*****tcpdump helper 21:51:09.832863 46:79:ec:30:85:19 > f6:36:48:9f:31:26, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 41422, offset 0, flags [DF], proto ICMP (1), length 84)10.0.0.2 > 10.0.0.1: ICMP echo request, id 751, seq 1, length 6421:51:09.834978 f6:36:48:9f:31:26 > 46:79:ec:30:85:19, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 32818, offset 0, flags [none], proto ICMP (1), length 84)10.0.0.1 > 10.0.0.2: ICMP echo reply, id 751, seq 1, length 642 packets captured2 packets received by filter0 packets dropped by kernel **************

*****tcpdump helper 21:51:19.980934 f6:36:48:9f:31:26 > 46:79:ec:30:85:19, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 33708, offset 0, flags [DF], proto ICMP (1), length 84)10.0.0.1 > 10.0.0.2: ICMP echo request, id 762, seq 1, length 6421:51:19.983063 46:79:ec:30:85:19 > f6:36:48:9f:31:26, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 43671, offset 0, flags [none], proto ICMP (1), length 84)10.0.0.2 > 10.0.0.1: ICMP echo reply, id 762, seq 1, length 6421:51:19.987396 46:79:ec:30:85:19 > f6:36:48:9f:31:26, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 43672, offset 0, flags [DF], proto ICMP (1), length 84)10.0.0.2 > 10.0.0.1: ICMP echo request, id 763, seq 1, length 6421:51:19.989565 f6:36:48:9f:31:26 > 46:79:ec:30:85:19, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 33709, offset 0, flags [none], proto ICMP (1), length 84)10.0.0.1 > 10.0.0.2: ICMP echo reply, id 763, seq 1, length 644 packets captured4 packets received by filter0 packets dropped by kernel **************

The packets received always sum to 8 which is the correct result (tested on separate environment as well) but it split up into 2, 2, and 4 packets captured.

Here's the test result including the debug statement. mirror_test_result2.txt

anarkiwi commented 4 years ago

Thanks for the extra detail. You are correct the helper is called 3 times, but 2 of those are for the proceeding verify_ping_mirrored() check, which passes. So that's expected. If you run your test in software only mode, you should also see it run 3 times, but the last time it should show 8 packets.

If it shows 8 packets on software then it is a real test failure on hardware because the number of times tcpdump helper is run is the same in both cases.

If it does not pass on software mode either, then something odd is going on in the test environment and I would need to see the entire /var/tmp/ test logs.

rahul217 commented 4 years ago

Digging into this further reveals a corner case issue for packets being sent twice unintentionally to the same mirror port.

Corresponding faucet.yaml

     interfaces:
       1:
         native_vlan: vlan400
       2:
         native_vlan: vlan400
       3:
         mirror: [1,2]
         output_only: true 

Here is the trace for a version of the MultiMirror test. (Sending packets from port 1 to port 2)

Flow: ip,in_port=1,vlan_tci=0x0000,dl_src=08:00:27:5c:4e:d5,dl_dst=08:00:27:70:a1:72,nw_src=0.0.0.0,nw_dst=0.0.0.0,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0

Rule: table=0 cookie=0x5adc15c0 priority=9000,in_port=1,vlan_tci=0x0000/0x1fff
OpenFlow actions=output:3,push_vlan:0x8100,set_field:4496->vlan_vid,goto_table:1

        Resubmitted flow: ip,in_port=1,dl_vlan=400,dl_vlan_pcp=0,dl_src=08:00:27:5c:4e:d5,dl_dst=08:00:27:70:a1:72,nw_src=0.0.0.0,nw_dst=0.0.0.0,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
        Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 reg8=0x0 reg9=0x0 reg10=0x0 reg11=0x0 reg12=0x0 reg13=0x0 reg14=0x0 reg15=0x0
        Resubmitted  odp: 5
        Resubmitted megaflow: recirc_id=0,ip,in_port=1,vlan_tci=0x0000,dl_src=08:00:27:5c:4e:d5,nw_frag=no
        Rule: table=1 cookie=0x5adc15c0 priority=8191,in_port=1,dl_vlan=400,dl_src=08:00:27:5c:4e:d5
        OpenFlow actions=goto_table:2

                Resubmitted flow: unchanged
                Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 reg8=0x0 reg9=0x0 reg10=0x0 reg11=0x0 reg12=0x0 reg13=0x0 reg14=0x0 reg15=0x0
                Resubmitted  odp: 5
                Resubmitted megaflow: recirc_id=0,ip,in_port=1,vlan_tci=0x0000,dl_src=08:00:27:5c:4e:d5,dl_dst=08:00:27:70:a1:72,nw_frag=no
                Rule: table=2 cookie=0x5adc15c0 priority=8192,dl_vlan=400,dl_dst=08:00:27:70:a1:72
                OpenFlow actions=pop_vlan,output:3,output:2

Final flow: unchanged
Megaflow: recirc_id=0,ip,in_port=1,vlan_tci=0x0000,dl_src=08:00:27:5c:4e:d5,dl_dst=08:00:27:70:a1:72,nw_frag=no
Datapath actions: 3,3,2

We would end up with a new flow that wants to send the packet twice to the same mirror port. Hope this helps!

anarkiwi commented 4 years ago

So what is happening, is we have a hook to mirror packets destined to a host that is on a port, that we want to mirror. So the extra mirror action is in the eth_dst table. This catches the "mirror packets, from a port that is not mirrored to a port that is mirrored" case.

The problem is that the eth_dst doesn't "know" that the packet has already been mirrored because it's from a mirrored port, because it doesn't match on in_port (we don't match on in_port there because the rest of the time we don't need to, and that also make eth_dst an exact match does it doesn't have to go in TCAM which is great for scaling).

If we add an in_port match (which have to be wildcard) that would cause eth_dst to lose the exact match property which is bad.

I'll have a think about how to address this