Juniper / contrail-vrouter

Contrail Virtual Router
BSD 2-Clause "Simplified" License

Help with understanding IF Drops and Fragment issues #96

Open nirhenn opened 7 years ago

nirhenn commented 7 years ago

Hi all,

I need help troubleshooting IF Drops and Fragment errors. I am running OpenContrail 3.0.2 with DPDK. When I run an L2 load within a single node (all VMs are on the same compute node), once traffic passes 18 Gbps I see IF Drops and Fragment errors, and connections to the VMs start to stutter.

I use isolcpus=0-3 on NUMA node 0 for vrouter-dpdk, with 1G hugepages configured in grub. The strange thing is that the first 4 cores sit at 100% almost constantly, but in the process list I see 10 to 11 vrouter-dpdk processes, of which only 4 are at 100% and the rest are at 0.x%. Is that normal? Also, the processes at 0.x% are not running on cores 0-3. Is that normal, even though the taskset in vrouter-dpdk.ini is set to -c 0,1,2,3?

What steps can I take to identify where these issues come from?

Nir.
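
One way to check the affinity question above is to read each thread's allowed CPU set from /proc. The following is only a minimal sketch, not part of vRouter: the helper names are invented for illustration, and it assumes a Linux host where `/proc/<pid>/task/<tid>/status` contains a `Cpus_allowed_list` line.

```python
import os
import re


def expand_cpu_list(text):
    """Expand a kernel-style CPU list such as '0-3,8' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus(status_text):
    """Return the CPUs named on the Cpus_allowed_list line of a status file."""
    match = re.search(r"^Cpus_allowed_list:\s*(\S+)", status_text, re.MULTILINE)
    return expand_cpu_list(match.group(1)) if match else []


def thread_affinities(pid):
    """Map each thread ID of `pid` to its allowed CPUs (Linux only)."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        with open(os.path.join(task_dir, tid, "status")) as handle:
            result[int(tid)] = allowed_cpus(handle.read())
    return result


def outside(affinities, expected):
    """Return the threads whose allowed CPUs are not all within `expected`."""
    expected = set(expected)
    return {tid: cpus for tid, cpus in affinities.items()
            if not set(cpus) <= expected}
```

Calling `outside(thread_affinities(pid), range(4))` with the PID of the vrouter-dpdk process would list the threads that are allowed to run on CPUs other than 0-3. It reports the affinity mask only, not where a thread is currently scheduled.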

nirhenn commented 7 years ago

Some more info on env:

root@node-36:~# dropstats
GARP                          0
ARP no where to go            0
Invalid ARPs                  0

Invalid IF                    4
Trap No IF                    0
IF TX Discard                 0
IF Drop                       1651839472
IF RX Discard                 0

Flow Unusable                 4
Flow No Memory                0
Flow Table Full               0
Flow NAT no rflow             0
Flow Action Drop              421044
Flow Action Invalid           0
Flow Invalid Protocol         0
Flow Queue Limit Exceeded     3027684

Discards                      8014
TTL Exceeded                  0
Mcast Clone Fail              0
Cloned Original               16

Invalid NH                    352
Invalid Label                 0
Invalid Protocol              0
Rewrite Fail                  0
Invalid Mcast Source          0

Push Fails                    0
Pull Fails                    0
Duplicated                    0
Head Alloc Fails              0
Head Space Reserve Fails      0
PCOW fails                    0
Invalid Packets               0

Misc                          0
Nowhere to go                 0
Checksum errors               0
No Fmd                        0
Invalid VNID                  0
Fragment errors               359772754
Invalid Source                0
Jumbo Mcast Pkt with DF Bit   0
ARP No Route                  0
ARP Reply No Route            0
No L2 Route                   15976

VLAN fwd intf failed TX       0
VLAN fwd intf failed enq      0
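
With this many counters, it can help to filter out the zeros and rank what remains. The sketch below is an illustration only: the function names are made up, and the regular expression assumes each counter is a name made of letters, digits and spaces followed by an integer, which holds for the output above but is not a documented dropstats format.

```python
import re

# Counter name (letters, digits, spaces) followed by an integer value.
COUNTER = re.compile(r"([A-Za-z][A-Za-z0-9 ]*?)\s+(\d+)(?=\s|$)")


def parse_counters(text):
    """Turn 'Name value Name value ...' text into a {name: int} dict."""
    return {name.strip(): int(value) for name, value in COUNTER.findall(text)}


def nonzero_ranked(counters):
    """Return the non-zero counters, largest first."""
    hits = [(name, value) for name, value in counters.items() if value]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Three groups of counters copied from the dropstats output above.
sample = """
Invalid IF 4 Trap No IF 0 IF TX Discard 0 IF Drop 1651839472 IF RX Discard 0
Flow Unusable 4 Flow No Memory 0 Flow Table Full 0 Flow NAT no rflow 0 Flow Action Drop 421044 Flow Action Invalid 0 Flow Invalid Protocol 0 Flow Queue Limit Exceeded 3027684
Misc 0 Nowhere to go 0 Checksum errors 0 No Fmd 0 Invalid VNID 0 Fragment errors 359772754 Invalid Source 0 Jumbo Mcast Pkt with DF Bit 0 ARP No Route 0 ARP Reply No Route 0 No L2 Route 15976
"""

for name, value in nonzero_ranked(parse_counters(sample)):
    print(f"{name:28} {value}")
```

On this sample the two largest counters are IF Drop and Fragment errors, which matches the symptoms described in the first comment. The shell prompt line should be left out of the input, since the pattern would otherwise treat the command name as part of a counter name.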

Most, if not all, of the errors come from core 10:

Statistics for core 10

GARP                          0
ARP no where to go            0
Invalid ARPs                  0

Invalid IF                    0
Trap No IF                    0
IF TX Discard                 0
IF Drop                       852489867
IF RX Discard                 0

Flow Unusable                 0
Flow No Memory                0
Flow Table Full               0
Flow NAT no rflow             0
Flow Action Drop              1183
Flow Action Invalid           0
Flow Invalid Protocol         0
Flow Queue Limit Exceeded     0

Discards                      0
TTL Exceeded                  0
Mcast Clone Fail              0
Cloned Original               0

Invalid NH                    0
Invalid Label                 0
Invalid Protocol              0
Rewrite Fail                  0
Invalid Mcast Source          0

Push Fails                    0
Pull Fails                    0
Duplicated                    0
Head Alloc Fails              0
Head Space Reserve Fails      0
PCOW fails                    0
Invalid Packets               0

Misc                          0
Nowhere to go                 0
Checksum errors               0
No Fmd                        0
Invalid VNID                  0
Fragment errors               0
Invalid Source                0
Jumbo Mcast Pkt with DF Bit   0
ARP No Route                  0
ARP Reply No Route            0
No L2 Route                   289

VLAN fwd intf failed TX       0
VLAN fwd intf failed enq      0
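
To see how much of each node-wide counter core 10 actually accounts for, the per-core values can be divided by the totals. This is a small arithmetic sketch using numbers copied from the two outputs above; the `share` helper is hypothetical, and the comparison assumes both outputs were captured at roughly the same time.

```python
def share(per_core, total):
    """Fraction of a node-wide counter attributed to one core (0.0 if total is 0)."""
    return per_core / total if total else 0.0


# (core 10 value, node-wide value), copied from the two dropstats outputs above.
counters = {
    "IF Drop": (852489867, 1651839472),
    "Fragment errors": (0, 359772754),
    "Flow Action Drop": (1183, 421044),
    "No L2 Route": (289, 15976),
}

for name, (core_value, total) in counters.items():
    print(f"{name:18} core 10 share: {share(core_value, total):.1%}")
```

By these numbers core 10 carries roughly half of the IF Drop count and none of the Fragment errors, so the remaining drops would have to be looked for on other cores.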

root@node-36:~# vif -l
Vrouter Interface Table

Flags: P=Policy, X=Cross Connect, S=Service Chain, Mr=Receive Mirror
       Mt=Transmit Mirror, Tc=Transmit Checksum Offload, L3=Layer 3, L2=Layer 2
       D=DHCP, Vp=Vhost Physical, Pr=Promiscuous, Vnt=Native Vlan Tagged
       Mnp=No MAC Proxy, Dpdk=DPDK PMD Interface, Rfl=Receive Filtering Offload, Mon=Interface is Monitored
       Uuf=Unknown Unicast Flood, Vof=VLAN insert/strip offload

vif0/0 PCI: 0:0:0.0 (Speed 10000, Duplex 1) Type:Physical HWaddr:90:e2:ba:8f:9d:f0 IPaddr:0 Vrf:0 Flags:L3L2Vp MTU:1514 Ref:17 RX device packets:117822 bytes:25471899 errors:1 RX port packets:117822 errors:0 RX queue packets:2698 errors:1108 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 1108 0 RX packets:116714 bytes:24893783 errors:0 TX packets:18969979 bytes:29044258968 errors:0 TX port packets:37567365 errors:0 TX device packets:37567365 bytes:30197296900 errors:0

vif0/1 Virtual: vhost0 Type:Host HWaddr:90:e2:ba:8f:9d:f0 IPaddr:0 Vrf:0 Flags:L3L2 MTU:1514 Ref:10 RX port packets:215901 errors:0 RX queue packets:209491 errors:6362 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 788 5574 0 RX packets:209539 bytes:230319764 errors:0 TX packets:114016 bytes:24632043 errors:0 TX queue packets:99643 errors:0 TX port packets:114016 errors:0

vif0/2 Socket: unix Type:Agent HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:65535 Flags:L3 MTU:1514 Ref:2 RX port packets:2796 errors:0 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RX packets:2796 bytes:240456 errors:4 TX packets:8270 bytes:8052699 errors:0 TX queue packets:8270 errors:0 TX port packets:8270 errors:0 syscalls:8278

vif0/4 PMD: tap8dfb4f0d-b4 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1552 errors:0 RX queue packets:1373 errors:11 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 11 0 RX packets:1541 bytes:1482486 errors:0 TX packets:149 bytes:6258 errors:0 TX port packets:149 errors:0 syscalls:149

vif0/5 PMD: tapc9db1a03-ff Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1548 errors:0 RX queue packets:1176 errors:209 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 209 0 0 RX packets:1339 bytes:1212282 errors:0 TX packets:143 bytes:6006 errors:0 TX port packets:143 errors:0 syscalls:143

vif0/6 Ethernet: vetheaa774c7-c Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:6 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:10 errors:0 RX queue packets:2 errors:0 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RX packets:10 bytes:476 errors:0 TX packets:10 bytes:420 errors:2 TX queue packets:8 errors:0 TX port packets:8 errors:0

vif0/7 PMD: tap49823e4f-1b Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1019377032 errors:0 syscalls:2048 RX queue packets:166893433 errors:852483430 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 852483422 8 0 RX packets:166893602 bytes:242893359607 errors:0 TX packets:1488 bytes:102392 errors:0 TX port packets:1488 errors:0 syscalls:1427

vif0/8 PMD: tapc6f25dc7-d3 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:997707640 errors:0 syscalls:3754 RX queue packets:198359542 errors:799347927 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 799347927 0 RX packets:198359713 bytes:277502123415 errors:0 TX packets:1454 bytes:100652 errors:0 TX port packets:1454 errors:0 syscalls:1397

vif0/9 Ethernet: vethb82ae1d8-8 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:1 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:10 errors:0 RX queue packets:2 errors:0 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RX packets:10 bytes:476 errors:0 TX packets:10 bytes:420 errors:2 TX queue packets:8 errors:0 TX port packets:8 errors:0

vif0/10 PMD: tapdf63a5cb-16 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1552 errors:0 RX queue packets:1333 errors:20 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 16 4 0 RX packets:1532 bytes:1472004 errors:0 TX packets:146 bytes:6132 errors:0 TX port packets:146 errors:0 syscalls:146

vif0/11 PMD: tapb05a7fdd-04 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1556 errors:0 RX queue packets:1374 errors:30 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 18 12 0 RX packets:1526 bytes:1478750 errors:0 TX packets:87508103 bytes:131943575895 errors:0 TX port packets:87475103 errors:33000 syscalls:7818654

vif0/13 PMD: tap084f2567-c9 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1545 errors:0 RX queue packets:1297 errors:86 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 86 0 RX packets:1459 bytes:1384826 errors:0 TX packets:143 bytes:6006 errors:0 TX port packets:143 errors:0 syscalls:143

vif0/14 PMD: tape3bc7866-3e Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:9 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1 errors:0 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RX packets:1 bytes:42 errors:0 TX packets:1 bytes:42 errors:0 TX port packets:1 errors:0 syscalls:1

vif0/15 PMD: tapccc14950-83 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1541 errors:0 RX queue packets:1269 errors:112 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 19 93 0 RX packets:1429 bytes:1363553 errors:0 TX packets:141 bytes:5982 errors:0 TX port packets:141 errors:0 syscalls:141

vif0/16 Ethernet: veth24265d20-d Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:3 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:11 errors:0 RX queue packets:0 errors:3 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 3 0 0 RX packets:8 bytes:336 errors:0 TX packets:8 bytes:336 errors:0 TX queue packets:8 errors:0 TX port packets:8 errors:0

vif0/17 Ethernet: vethed5dc57a-1 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:3 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:11 errors:0 RX queue packets:2 errors:1 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 1 0 RX packets:10 bytes:476 errors:0 TX packets:8 bytes:336 errors:0 TX queue packets:8 errors:0 TX port packets:8 errors:0

vif0/18 PMD: tap89771556-14 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1546 errors:0 RX queue packets:1327 errors:23 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 23 0 0 RX packets:1523 bytes:1471125 errors:0 TX packets:142 bytes:5964 errors:0 TX port packets:142 errors:0 syscalls:142

vif0/19 PMD: tap02760b58-df Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1542 errors:0 RX queue packets:1287 errors:95 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 24 71 0 RX packets:1447 bytes:1385399 errors:0 TX packets:139 bytes:5838 errors:0 TX port packets:139 errors:0 syscalls:139

vif0/20 PMD: tapa5b35a02-3b Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:8 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1586 errors:0 RX queue packets:1375 errors:42 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 31 11 0 RX packets:1544 bytes:1481533 errors:0 TX packets:29620743 bytes:44740006989 errors:0 TX port packets:29594077 errors:26667 syscalls:1763294

vif0/21 PMD: tap595588eb-33 Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1555 errors:0 RX queue packets:1370 errors:9 RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 0 9 0 RX packets:1546 bytes:1478840 errors:0 TX packets:147 bytes:6174 errors:0 TX port packets:147 errors:0 syscalls:147

root@node-36:~#
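
The interfaces with the large error counts stand out once the RX queue errors are put next to the RX port packets. The sketch below parses one record from the listing above; the function name and the returned keys are made up for illustration, and it assumes that "RX port packets" and "RX queue errors" count the same packets and that each position in the "RX queue errors to lcore" list corresponds to one lcore, neither of which is confirmed in this thread.

```python
import re


def rx_queue_summary(record):
    """Extract RX port/queue counters and per-lcore errors from one vif record."""
    port = re.search(r"RX port\s+packets:(\d+)", record)
    queue = re.search(r"RX queue\s+packets:(\d+)\s+errors:(\d+)", record)
    lcores = re.search(r"RX queue errors to lcore((?:\s+\d+)+)", record)
    per_lcore = [int(n) for n in lcores.group(1).split()] if lcores else []
    rx_port = int(port.group(1)) if port else 0
    errors = int(queue.group(2)) if queue else 0
    return {
        "rx_port_packets": rx_port,
        "rx_queue_errors": errors,
        "error_ratio": errors / rx_port if rx_port else 0.0,
        # (position in the list, error count) for every non-zero entry
        "hot_positions": [(i, n) for i, n in enumerate(per_lcore) if n],
    }


# The vif0/7 record, copied from the output above.
vif7 = (
    "vif0/7 PMD: tap49823e4f-1b Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0 "
    "Vrf:10 Flags:PL3L2D MTU:9160 Ref:13 RX port packets:1019377032 errors:0 "
    "syscalls:2048 RX queue packets:166893433 errors:852483430 "
    "RX queue errors to lcore 0 0 0 0 0 0 0 0 0 0 0 852483422 8 0 "
    "RX packets:166893602 bytes:242893359607 errors:0"
)

summary = rx_queue_summary(vif7)
print(summary)
```

For vif0/7 this gives an error ratio of about 0.84, with nearly all of the errors at a single position in the per-lcore list. Running the same function over the vif0/8 record shows a similar concentration one position later.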

root@node-36:~# vrfstats --dump
Vrf: 0
Discards 0, Resolves 0, Receives 130769, L2 Receives 0, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 0, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 0, Encaps 234346
GROs 0, Diags 0
Arp Virtual Proxys 0, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 1
Discards 0, Resolves 0, Receives 0, L2 Receives 9, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 2, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 0, Encaps 0
GROs 0, Diags 0
Arp Virtual Proxys 0, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 3
Discards 5397, Resolves 0, Receives 0, L2 Receives 19, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 2, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 18758961, Vxlan Tunnels 0
L2 Encaps 0, Encaps 0
GROs 0, Diags 0
Arp Virtual Proxys 0, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 6
Discards 0, Resolves 0, Receives 0, L2 Receives 13, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 2, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 0, Encaps 0
GROs 0, Diags 0
Arp Virtual Proxys 0, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 8
Discards 0, Resolves 0, Receives 0, L2 Receives 1266, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 6, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 977, Encaps 283
GROs 0, Diags 0
Arp Virtual Proxys 977, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 9
Discards 0, Resolves 0, Receives 0, L2 Receives 244, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 0, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 1, Encaps 389
GROs 0, Diags 0
Arp Virtual Proxys 1, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

Vrf: 10
Discards 2617, Resolves 0, Receives 0, L2 Receives 127021923, Vrf Translates 0, Unknown Unicast Floods 0
Ecmp Composites 0, L2 Mcast Composites 6, Fabric Composites 0, Encap Composites 0, Evpn Composites 0
Udp Tunnels 0, Udp Mpls Tunnels 0, Gre Mpls Tunnels 0, Vxlan Tunnels 0
L2 Encaps 948, Encaps 127021557
GROs 0, Diags 0
Arp Virtual Proxys 948, Arp Virtual Stitchs 0, Arp Virtual Floods 0, Arp Physical Stitchs 0, Arp Tor Proxys 0, Arp Physical Floods 0

nirhenn commented 7 years ago

Another update: looking at the flow table query in the Contrail web UI, I see a lot of dropped traffic. What all the drops have in common is that the destination VN is unknown for some reason.

deny 2017-01-08 16:04:06:956:248 node-36 default-domain:Proj1:proj1_1 UNKNOWN 192.168.11.104 192.168.100.5 37241 5044 TCP 74 B 1
deny 2017-01-08 16:04:05:322:569 node-36 default-domain:Proj1:proj1_1 UNKNOWN 192.168.11.102 192.168.100.5 40503 25826 UDP 2.69 KB 2
deny 2017-01-08 16:04:05:265:420 node-36 default-domain:Proj1:proj1_1 UNKNOWN 192.168.11.103 192.168.100.5 39786 25826 UDP 0 B 0
deny 2017-01-08 16:04:04:438:845 node-36 default-domain:Proj1:proj1_1 UNKNOWN 192.168.11.105 192.168.100.5 36912 25826 UDP 1.35 KB 1

kiss9 commented 7 years ago

I am facing the same problem: a lot of packets are dropped as "IF Drop" on a VM's vif. Any clues on how to debug this further?