aristanetworks / avd

Arista Validated Designs
https://avd.arista.com
Apache License 2.0
290 stars 210 forks source link

BGP neighbor's ip address should not be the key in the YAML schema #1045

Closed sc68cal closed 3 years ago

sc68cal commented 3 years ago

Let's say you have the following in host_vars for the switch myswitch.foo.bar

uplink1: 128.66.0.2/31
uplink2: 128.66.1.2/31
uplink3: 128.66.2.2./31
uplink4: 128.66.3.2/31

And so on and so forth. For every uplink you have been given, your remote end is using the first IP address in the block and you've been given the second.

Now, it's time to configure your BGP neighbors using the ansible-avd collection, and the eos_cli_config_gen role specifically.

Now, previously you had code that used https://github.com/ansible-collections/arista.eos/blob/main/docs/arista.eos.eos_bgp_global_module.rst - so you were able to do a couple neat little tricks to reduce the amount of variables you're carrying around

bgp_neighbors:
  - peer: "{{ uplink1 | ansible.netcommon.ipmath(-1) }}"
    remote_as: "{{ upstream_asn }}"
  - peer: "{{ uplink2 | ansible.netcommon.ipmath(-1) }}"
    remote_as: "{{ upstream_asn }}"

The point being, that the YAML schema used an array for all the bgp neighbors, with each neighbor being a dictionary, and a peer attribute that had the IP address of the neighbor as the value.

It is not possible with Ansible-AVD to do something similar, since each BGP neighbor is a dictionary in YAML, with each key in the dictionary being the IP address, which is not valid YAML and Ansible does not substitute the Jinja2 expression

router_bgp:
  as: "{{ our_asn }}"
  neighbors:
   "{{ uplink1 | ansible.netcommon.ipmath(-1) }}":
      remote_as: "{{ upstream_asn }}"

This results in an invalid config

router bgp 65000
   router-id 96.216.37.2/30
   neighbor {{ uplink1 | ansible.netcommon.ipmath(-1) }} remote-as 65002
!

https://github.com/ansible/ansible/issues/17324

fredhsu commented 3 years ago

@carlbuchmann Any thoughts behind using the peer as the key vs listing the peers and making it a value?

carlbuchmann commented 3 years ago

@sc68cal - Thank you for providing feedback on the data model.

Just to clarify, what is not valid YAML schema is using a jinja2 expression as the key, I believe this is what you are stating.

I understand where you are coming from and what you are stating makes perfect sense when trying to consuming eos_cli_config_gen natively using group_vars/host_vars.

However, eos_cli_config_gen is designed in mind that there is a higher level fabric abstraction i.e eos_designs or similar role driving this configuration. In other words, the schema in eos_cli_config_gen, it’s simply a YAML representation of the EOS CLI and nothing else. We avoid any kind of abstraction in this layer, it's a one-to-one mapping.

The reason we have chosen to leverage a schema with dictionary keys vs. lists are the following:

As an example of driving the underlay configuration in eos_designs you can look at this template. https://github.com/aristanetworks/ansible-avd/blob/devel/ansible_collections/arista/avd/roles/eos_designs/templates/designs/l3ls-evpn/underlay/main.j2

carlbuchmann commented 3 years ago

Additionally this is the specific template that drive eBGP peering: https://github.com/aristanetworks/ansible-avd/blob/devel/ansible_collections/arista/avd/roles/eos_designs/templates/designs/l3ls-evpn/underlay/ebgp/router-bgp.j2

sc68cal commented 3 years ago

The only design that you have documented in eos_designs is a design for doing BGP/EVPN with VXLAN. That is not what I am tasked with implementing. What I am attempting to build is very simple, just a spine layer that connects to the top of rack leafs via BGP and acts as an aggregation layer. My preference was to use parts of Ansible-AVD since that is where your company has indicated they are providing support for Ansible automation with Arista EOS.

My hope was to just consume the eos_cli_config_gen role and the templates for configuring the things that I needed configuring, without having to dive into the eos_designs role and try to figure out how to hack up a topology that implements what I was looking for. My main fear is that by focusing on BGP/EVPN and VXLAN you have made it very easy for those that are looking to do that, while also making it extremely difficult for someone attempting to do anything else.

carlbuchmann commented 3 years ago

@sc68cal You are absolutely correct in stating that currently we only support BGP/EVPN VXLAN design in this role at this time. However, we have been in the process of refactoring this role (in fact, it used to be called eos_l3ls_evpn) in order to support various use cases and designs, notably: pure L3LS, L2LS, WAN w/MPLS/SR/LDP, etc... This is enabled by the action plugin yaml_template_to_facts (https://avd.sh/en/latest/plugins/index.html#yaml-templates-to-facts), which allow us to re-use the various template building blocks across designs.

What other customers have done at this time is create their own abstraction role similar to how eos_designs function!

We would be happy to discuss and understand your use case to ensure we prioritize accordingly and address any gaps!

sc68cal commented 3 years ago

That's what bothers me about the structure and design of this project. In order to get even the smallest configuration built, I have to wade through at least three layers of indirection (yaml_template_to_facts, eos_designs then eos_cli_config_gen). That bothers me. In the case of eos_cli_config_gen the abstraction layer is relatively thin, all I have to do is figure out what the structure of the YAML needs to be, to match the configuration that I want to be rendered by the template. That's fine, since that's pretty much the same thing I have to do for pretty much every other network OS' Ansible modules or roles.

What troubles me is the upper layers of abstraction. yaml_template_to_facts has just one small example documented, for what is probably a very very complicated piece of code to drive the lower layers. Same with eos_designs.

Again, it appears that while Ansible-AVD has "made the impossible, possible" (BGP/EVPN) it has not made "the possible, easy" or "the easy, elegant" as the saying goes.

At this point I'll probably go back to the arista.eos collection since there, I can slowly build parts of my config and demonstrate forward progress every sprint. The trouble I have with AVD is that you have to eat the whole elephant, as it were, before you get any meaningful forward progress.

carlbuchmann commented 3 years ago

@sc68cal, I'm sorry that you feel this way, and as we are working on the 3.0.0 milestone, we plan on providing more examples in this area as we support more design or how to customize/extend it to support your own!

I wanted to address your comments relating to yaml_templates_to_facts. It is not an abstraction layer, it is simply a mechanism that allows us to "stack" jinja2 templates together more dynamically to allow better re-usability of the jinja2 templates we have developed. It simply replaces the use of the template module + jinja2 root template with lots of include statements! It also allows the customers to override defaults and add their own templates to accommodate their own needs! You can certainly look under the hood here -> https://github.com/aristanetworks/ansible-avd/blob/devel/ansible_collections/arista/avd/plugins/action/yaml_templates_to_facts.py

As I said, we are not forcing you to use eos_designs and you can create your own role with the Ansible template module that generates the EOS "structured configuration" in YAML. The benefit of this approach is that eos_cli_config_gen will create per device as-built documentation, and you will be able to take advantage of the eos_validate_state role to dynamically perform network-ready-for-use tests and report on your network based on your intent! Perhaps look at the previous version of eos_designs, eos_l3ls_evpn to provide some inspiration on how to create this role for you. -> https://github.com/aristanetworks/ansible-avd/blob/v1.1.3/ansible_collections/arista/avd/roles/eos_l3ls_evpn/tasks/main.yml (You could probably pick apart all the EVPN stuff, and have everything you need to build a pure l3 leaf-spine fabric!)

In the end, pick which way works best for you and your team, but if you decide to go the route of AVD, we would be happy to have live sessions to answer your questions and help you be successful with automating your network!

ClausHolbechArista commented 3 years ago

@sc68cal I did a small mock up of your variables, and with some extra json conversion it can work.

uplink1: 128.66.0.2/31
uplink2: 128.66.1.2/31
uplink3: 128.66.2.2/31
uplink4: 128.66.3.2/31
our_asn: 123
upstream_asn: 321

custom_structured_configuration_router_bgp: 
  as: "{{ our_asn }}"
  neighbors: "{{ { uplink1 | ansible.netcommon.ipmath(-1) : { 'remote_as': upstream_asn },
                   uplink2 | ansible.netcommon.ipmath(-1) : { 'remote_as': upstream_asn },
                   uplink3 | ansible.netcommon.ipmath(-1) : { 'remote_as': upstream_asn },
                   uplink4 | ansible.netcommon.ipmath(-1) : { 'remote_as': upstream_asn } } }}"

Produces this (ignore the custom_structured_configuration_ prefix for your usecase):

router_bgp:
  as: 123
  neighbors:
    128.66.0.1:
      remote_as: 321
    128.66.1.1:
      remote_as: 321
    128.66.2.1:
      remote_as: 321
    128.66.3.1:
      remote_as: 321

Could be optimized by setting a var like uplink_bgp_template to ease setting multiple bgp knobs:

uplink_bgp_template:
  remote_as: 321

custom_structured_configuration_router_bgp: 
  as: "{{ our_asn }}"
  neighbors: "{{ { uplink1 | ansible.netcommon.ipmath(-1) : uplink_bgp_template,
                   uplink2 | ansible.netcommon.ipmath(-1) : uplink_bgp_template,
                   uplink3 | ansible.netcommon.ipmath(-1) : uplink_bgp_template,
                   uplink4 | ansible.netcommon.ipmath(-1) : uplink_bgp_template } }}"
sc68cal commented 3 years ago

@ClausHolbechArista thank you for the example. We've done something similar to that for a different set of Ansible automation with different devices, but it was implemented poorly and was awful to look at and read, so as a general rule we avoid doing the JSON-> YAML conversion trick. Your style is more compact and concise, so this is certainly something to look at further.