kamelnetworks / sonic

Our own public documentation on how to do things in SONiC
https://kamelnetworks.github.io/sonic/

ARP over EVPN blackholed from Sonic to Arista switch #9

Open bluecmd opened 3 years ago

bluecmd commented 3 years ago
+----------------------+                  +----------------------+
|                      |                  |                      |
|                      |                  |                      |
|                      |                  |                      |
|                      |   EVPN / VXLAN   |                      |
|        sonic         +------------------>        arista        |
|                      |                  |                      |
|                      |                  |                      |
|                      |                  |                      |
+-----------^----------+                  +----------+-----------+
            |                                        |
            |                                        X
            |                                        |
        +---+--+                                 +---v--+
        |      |                                 |      |
        |  c1  |                                 |  c2  |
        |      |                                 |      |
        +------+                                 +------+

When pinging c2 from c1 over IPv4 (ICMP), the ping fails. ICMPv6 works fine, and so does IPv4 with static ARP entries. The ARP request makes it all the way to c1, a reply is generated and reaches the Arista switch, but it is never sent out on the link towards c2.

ARP suppression on the Arista?

bluecmd commented 3 years ago

We see that the Arista sends Type-2 EVPN entries with an IP for both IPv4 and IPv6. SONiC also installs the ARP-suppressed neighbor for IPv4:

(vrf:mgmt)bluecmd@celeste:~$ ip neigh | grep 76:c0:
192.168.216.11 dev Vlan1991 lladdr 76:c0:f8:4b:c8:5c extern_learn  NOARP
fe80::74c0:f8ff:fe4b:c85d dev Vlan1991 lladdr 76:c0:f8:4b:c8:5d router REACHABLE
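
To tell the two kinds of neighbor entries above apart when checking many of them, a small parser can help. The following is only a sketch: the `Neighbor` struct and both function names are made up for illustration, and it assumes that `extern_learn` together with `NOARP` marks an entry installed from the EVPN control plane rather than resolved by the kernel itself.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One parsed line of `ip neigh` output (hypothetical helper type).
struct Neighbor {
    std::string ip;
    std::string dev;
    std::string lladdr;
    bool externLearn = false;  // "extern_learn" flag present on the line
    std::string state;         // last remaining token, e.g. NOARP or REACHABLE
};

// Parse a line such as:
//   192.168.216.11 dev Vlan1991 lladdr 76:c0:f8:4b:c8:5c extern_learn  NOARP
inline Neighbor parseNeighLine(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tok;
    for (std::string t; in >> t;) tok.push_back(t);

    Neighbor n;
    if (tok.empty()) return n;
    n.ip = tok.front();
    for (std::size_t i = 1; i < tok.size(); ++i) {
        if (tok[i] == "dev" && i + 1 < tok.size()) {
            n.dev = tok[++i];
        } else if (tok[i] == "lladdr" && i + 1 < tok.size()) {
            n.lladdr = tok[++i];
        } else if (tok[i] == "extern_learn") {
            n.externLearn = true;
        } else {
            // Flags such as "router" land here first and are then
            // overwritten by the state, which is the last token.
            n.state = tok[i];
        }
    }
    return n;
}

// Assumption: extern_learn + NOARP means the entry came from EVPN.
inline bool looksEvpnInstalled(const Neighbor& n) {
    return n.externLearn && n.state == "NOARP";
}
```

With the two lines shown above, `looksEvpnInstalled` would be true for 192.168.216.11 and false for the link-local IPv6 neighbor, which the kernel resolved on its own.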

Arista BGP objects:

andrea#show bgp evpn
BGP routing table information for VRF default
Router identifier 10.0.0.12, local AS number 65002
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
                    S - Stale, c - Contributing to ECMP, b - backup
                    % - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop

          Network                Next Hop              Metric  LocPref Weight  Path
 * >     RD: 10.0.0.11:2 mac-ip 0009.0f09.d401
                                 10.0.0.11             -       100     0       65001 i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.5655.f7b0
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.565a.fc2b
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.565b.fb41
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.565c.4f4a
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.565c.f6d4
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.565e.e2dc
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 0050.5697.f36d
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 3c2c.3078.5b80
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 76c0.f84b.c85c
                                 -                     -       -       0       i
 * >     RD: 10.0.0.12:1991 mac-ip 76c0.f84b.c85c 192.168.216.11
                                 -                     -       -       0       i
 * >     RD: 10.0.0.11:2 mac-ip 76c0.f84b.c85d
                                 10.0.0.11             -       100     0       65001 i
 * >     RD: 10.0.0.11:2 mac-ip 76c0.f84b.c85d fe80::74c0:f8ff:fe4b:c85d
                                 10.0.0.11             -       100     0       65001 i
 * >     RD: 10.0.0.11:2 imet 10.0.0.11
                                 10.0.0.11             -       100     0       65001 i
 * >     RD: 10.0.0.12:1991 imet 10.0.0.12
                                 -                     -       -       0       i

Whatever we try, we cannot seem to get the Arista to pass ARPs.

bluecmd commented 2 years ago

Possible patch that I whipped up from the latest discussions:

diff --git a/orchagent/vxlanorch.cpp b/orchagent/vxlanorch.cpp
index d983052..64b667c 100644
--- a/orchagent/vxlanorch.cpp
+++ b/orchagent/vxlanorch.cpp
@@ -1691,7 +1691,7 @@ bool VxlanTunnelMapOrch::addOperation(const Request& request)
     if (vni_id >= 1<<24)
     {
         SWSS_LOG_ERROR("Vxlan tunnel map vni id is too big: %d", vni_id);
-        return true;
+        return false;
     }

     tempPort.m_vnid = (uint32_t) vni_id;
@@ -1716,12 +1716,19 @@ bool VxlanTunnelMapOrch::addOperation(const Request& request)

     if (!tunnel_obj->isActive())
     {
+        auto encap_ttl = static_cast<sai_uint32_t>(request.getAttrUint("encap_ttl"));
+        if (encap_ttl > 255)
+        {
+            SWSS_LOG_ERROR("Vxlan tunnel map encap TTL is too big: %u", encap_ttl);
+            return false;
+        }
         //@Todo, currently only decap mapper is allowed
         //tunnel_obj->createTunnel(MAP_T::MAP_TO_INVALID, MAP_T::VNI_TO_VLAN_ID);
         uint8_t mapper_list = 0;
         TUNNELMAP_SET_VLAN(mapper_list);
         TUNNELMAP_SET_VRF(mapper_list);
-        tunnel_obj->createTunnelHw(mapper_list,TUNNEL_MAP_USE_DEDICATED_ENCAP_DECAP);
+        tunnel_obj->createTunnelHw(mapper_list, TUNNEL_MAP_USE_DEDICATED_ENCAP_DECAP,
+                                   /* with_term */ true, encap_ttl);
     }

     const auto tunnel_map_id = tunnel_obj->getDecapMapId(TUNNEL_MAP_T_VLAN);

This might allow enabling PIPE mode with a specific TTL, like this:

{
    "VXLAN_TUNNEL_MAP": {
        "nve1|map_1991_Vlan1991": {
            "vlan": "Vlan1991",
            "vni": "1991",
            "encap_ttl": "64"
        }
    }
}
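
For reference, here is a standalone sketch of what the range checks and the PIPE setting are meant to achieve. None of it is orchagent code: `TunnelMapEntry`, `TtlMode`, `isValid` and `outerTtl` are hypothetical names. It only models the general rule that the pipe model writes a fixed, configured TTL into the outer header, while the uniform model derives it from the inner packet. Treating a non-IP payload such as ARP as having no inner TTL is an assumption of this sketch, not something confirmed for the hardware in question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one VXLAN_TUNNEL_MAP entry.
struct TunnelMapEntry {
    std::uint64_t vni;
    std::uint64_t encapTtl;
};

enum class TtlMode { Uniform, Pipe };

// Marker for "no TTL available" (for example, a non-IP inner frame).
constexpr int kNoTtl = -1;

// A VNI is a 24-bit field and an IP TTL is an 8-bit field.
inline bool isValid(const TunnelMapEntry& e) {
    return e.vni < (1ULL << 24) && e.encapTtl <= 255;
}

// Outer-header TTL for an encapsulated frame under this simplified model.
// innerTtl is kNoTtl when the payload carries no IP header.
inline int outerTtl(TtlMode mode, const TunnelMapEntry& e, int innerTtl) {
    if (!isValid(e)) return kNoTtl;
    if (mode == TtlMode::Pipe) {
        return static_cast<int>(e.encapTtl);  // fixed, independent of the payload
    }
    return innerTtl;  // uniform: taken from the payload, if it has one
}
```

In this model, an entry with VNI 1991 and TTL 64 in pipe mode always gives an outer TTL of 64, whether the inner frame is ARP or IP. In uniform mode the result for an ARP frame is left undefined.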