Closed: jbemmel closed this issue 2 years ago
For your reference - here is one way to allow for per-node import/export policies: https://github.com/ipspace/netlab/compare/dev...jbemmel:netlab:node-local-vrf-imexports?expand=1
I won't submit a PR - just sharing some thoughts
How do you manage to come up with these ridiculous scenarios? This is another one of the "just because you could doesn't mean that you should". Among other things, this tool should promote sane practices -- how about asking someone with a decade or two of operational experience running big networks on the customer side (not working for a vendor) whether new ideas make operational sense and are a good design practice before spending your time writing the code?
Now for the technical reasons against "same VRF with different import/export policies per node". When I was still teaching the MPLS/VPN course (and please don't waste everyone's time telling me how EVPN is different), I made a simple recommendation: if two interfaces could use the same forwarding table, they could be in the same VRF. If not, then you need two VRFs. The same principle should be applied across multiple devices. Same forwarding information => same VRF. Different forwarding information => different VRFs.
As VRFs with different import/export policies clearly belong to the "different forwarding information" category, they should use different VRFs. It's not like we have a tightly constrained namespace.
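In netlab terms, that recommendation boils down to defining one VRF per distinct forwarding behavior. A minimal sketch (VRF names and route-target values are invented for illustration):

```yaml
vrfs:
  red-up:                      # one VRF per distinct forwarding behavior
    import: [ "65000:2" ]
    export: [ "65000:1" ]
  red-down:                    # a separate VRF, not 'red' with per-node overrides
    import: [ "65000:1" ]
    export: [ "65000:2" ]
```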
Also, I'm positive that there must be someone who spent decades studying ontology and came to the conclusion that it's a bad idea to call different things by the same name. Kant immediately came to mind as the first choice should someone wish to dig deeper into this interesting subject. Using the same name for VRFs that do different things would be a perfect example of what not to do.
Finally, if you absolutely insist on doing the wrong thing, you can still set different import/export policies in individual nodes -- after all, the title of this case is "node local VRF policies" -- so why are we even having this discussion?
When using a VRF in an EVPN context with symmetric IRB, each VRF is assigned a unique transit VNI. That value has to match across the various devices participating in that VRF. It is not possible to use two different VRF instances and still give each the same L3 VNI value: the transit_vni is (currently) part of the VRF's identity, just like its name.
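For reference, in a netlab topology the transit VNI is an attribute of the (global) VRF definition, so every node instantiating the VRF inherits the same value; a sketch (VRF name and VNI value invented):

```yaml
vrfs:
  red:
    evpn.transit_vni: 20000    # part of the VRF definition: identical on every
                               # node that instantiates 'red'
```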
If we were to relax that constraint and allow multiple VRFs to use the same transit_vni as long as they are not instantiated on the same node(s), it would be possible to use two different VRFs, each with globally consistent import/export policies. I tried setting evpn.transit_vni locally for different global VRFs to the same value, but this hits a problem with https://github.com/ipspace/netlab/blob/dev/netsim/modules/evpn.py#L124 which only checks global VRFs.
I'll concede that EVPN import/export route targets are typically configured identically (i.e. import = export, and the same on every VTEP), not hub/spoke-like as I'm suggesting here. I'll also take your word for it that in an MPLS/VPN context one would use separate upstream/downstream VRFs rather than a single VRF. So perhaps the issue is that there is some logical friction between a "VRF" in an MPLS/VPN context and a "VRF" in an EVPN context?
Finally, I can indeed proceed by copy/pasting the setting to the individual node level, bypassing the tool's insistence on identical import/export policies per node. There is some level of inconsistency in enforcing these constraints through groups but not for individual nodes; I guess for me that's a feature, not a bug.
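The node-level workaround could look roughly like this (hedged sketch; route-target values are invented, and the node-level attributes simply mirror the global vrfs syntax):

```yaml
vrfs:
  red:                          # single global VRF: one name, one transit VNI
    evpn.transit_vni: True

nodes:
  spine:
    vrfs:
      red:
        import: [ "65000:101" ]   # hub side: import spoke routes
        export: [ "65000:100" ]
  leaf1:
    vrfs:
      red:
        import: [ "65000:100" ]   # spoke side: mirrored targets
        export: [ "65000:101" ]
```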
The only response I can offer is two four-letter words, one of them being EVPN 🥴
On a more serious note, we should start with "how is one supposed to implement a common services VPN (or overlapping VPNs) with EVPN and symmetric IRB", and once we find a sane and scalable design, tweak the data model to support it.
Quick short-term fixes could include:
> There is some level of inconsistency in enforcing these constraints through groups but not for individual nodes; I guess for me that's a feature, not a bug.
Yeah, I'm trying to find a balance between being flexible enough to support ~~intentional stupidity~~ MacGyverism and having guardrails for people who want to get the job done and might make an inadvertent, hard-to-find typo.
Maybe I should put the above paragraph into "developer guidelines" 😇
How about we allow evpn.transit_vni to be the same on different VRFs if and only if the import/export clauses are mirrored, to allow EVPN hub/spoke topologies?
I appreciate this is a corner case, and we don't want to spend a lot of time solving corner cases, but I think this would be a small PR that adds the required flexibility while keeping a guardrail in place, without affecting more common usage.
> How about we allow evpn.transit_vni to be the same on different VRFs if and only if the import/export clauses are mirrored, to allow EVPN hub/spoke topologies?
How about we start with finding someone who knows enough about EVPN with VXLAN to answer three questions:
> I appreciate this is a corner case, and we don't want to spend a lot of time solving corner cases, but I think this would be a small PR that adds the required flexibility while keeping a guardrail in place, without affecting more common usage.
Small PRs sometimes introduce lots of unintended consequences ;) I would say there's no good reason to configure transit VNI by hand when building a simple lab, so whoever does that should know full well what they're doing (similar to static IP addresses).
I consulted with EVPN experts, and they indicated that ideally different VNI values would be used in the data plane, for each direction. This allows for topologies where a mix of hub/spoke VRFs reside on the same node, and different processing can be triggered in hardware based on the different tags.
Unfortunately, SR Linux (at least) currently requires the L3 VNI tag to be the same in both directions; RT5 routes with a non-matching tag don't get installed even when the RT does match.
Hence, I am proposing a compromise PR: https://github.com/ipspace/netlab/pull/502, i.e. by default generate the error that we currently have, but allow users to disable this check globally by setting a flag.
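Assuming the check is controlled through a topology-wide default, disabling it could look like this (the attribute name below is hypothetical; see the PR for the actual flag):

```yaml
defaults:
  evpn:
    transit_vni_check: False   # hypothetical flag name: accept duplicate
                               # transit VNIs instead of raising an error
```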
> I consulted with EVPN experts, and they indicated that ideally different VNI values would be used in the data plane, for each direction. This allows for topologies where a mix of hub/spoke VRFs reside on the same node, and different processing can be triggered in hardware based on the different tags.
Exactly as I thought ;) Copying MPLS/VPN design but failing due to VXLAN-based implementations ;))
> Unfortunately, SR Linux (at least) currently requires the L3 VNI tag to be the same in both directions; RT5 routes with a non-matching tag don't get installed even when the RT does match.
I think many other platforms behave the same way: they need a VXLAN interface and an internal VLAN (for the VNI), and they are not willing to create them automatically based on EVPN updates.
> Hence, I am proposing a compromise PR: #502, i.e. by default generate the error that we currently have, but allow users to disable this check globally by setting a flag.
How about adding the ability to use another VRF's name as the transit_vni? Will code that in a feature branch...
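The proposed syntax might let a VRF point at another VRF by name instead of a numeric VNI (sketch of the idea; VRF names invented):

```yaml
vrfs:
  red:
    evpn.transit_vni: True     # auto-assigned transit VNI
  blue:
    evpn.transit_vni: red      # proposed: reuse the transit VNI of VRF 'red'
```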
In the context of https://github.com/ipspace/netlab/commit/de2232e9a4e926d9ecac16c056f62b36af7647d1
At https://github.com/jbemmel/netsim-examples/tree/evpn-vxlan-to-hosts-v2/BGP/EVPN-VXLAN-to-hosts there is a use case for VRF leaking. The intent is to do VRF route leaking to 'internet' on the frr nodes only, not everywhere the VRF exists.
To achieve this, 'internet' is defined as a global VRF (https://github.com/jbemmel/netsim-examples/blob/evpn-vxlan-to-hosts-v2/BGP/EVPN-VXLAN-to-hosts/topology.yml#L37), but the import/export attributes are set to 'spoke' for the frr group (https://github.com/jbemmel/netsim-examples/blob/evpn-vxlan-to-hosts-v2/BGP/EVPN-VXLAN-to-hosts/topology.yml#L84) and to 'hub' for a hub group (spine only): https://github.com/jbemmel/netsim-examples/blob/evpn-vxlan-to-hosts-v2/BGP/EVPN-VXLAN-to-hosts/topology.yml#L100
With the new code this is not allowed:
From the documentation I get the impression that group-level import/export policies get promoted to global attributes, and I'm wondering whether that is always correct. The VRF id needs to be global, but import/export policies can vary per node (or group of nodes), for example in a hub/spoke topology (with asymmetrically swapped import/export targets).
Sample hub/spoke topology:
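Something along these lines (hedged sketch; group, node, and route-target names are invented): the hub group imports what the spokes export and vice versa, while the VRF itself stays global:

```yaml
vrfs:
  common:                       # global VRF: consistent id and transit VNI
    evpn.transit_vni: True

groups:
  hubs:
    members: [ spine ]
    vrfs:
      common:
        import: [ "65000:101" ]   # hub imports spoke routes
        export: [ "65000:100" ]
  spokes:
    members: [ leaf1, leaf2 ]
    vrfs:
      common:
        import: [ "65000:100" ]   # spokes import hub routes only
        export: [ "65000:101" ]
```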
If a user wants global import/export policies, they can (and should) define them on a global VRF (imho).