ipspace / netlab

Making virtual networking labs suck less
https://netlab.tools

Stub links not configured on SR Linux #171

Closed ipspace closed 2 years ago

ipspace commented 2 years ago

This topology:

provider: clab
defaults:
  device: srlinux

module: [ ospf ]

nodes:
  s1:
  s2:
  s3:
    device: cumulus
    runtime: docker

links: [ s1-s2, s2-s3, s1-s2-s3, s1, s2 ]

... crashes when configuring OSPF on SR Linux. The root cause seems to be the lack of interface configuration on stub links: they are created (with Linux bridges) but not configured in the initial configuration template.

ipspace commented 2 years ago

Merged your PR and started the same topology. No configuration errors, but OSPF does not come up between SR Linux and Cumulus, and I don't know enough about SR Linux to troubleshoot it. When I change all three nodes to Cumulus, everything works.

jbemmel commented 2 years ago

On Ansible 2.9.1, the Cumulus deployment fails:

TASK [Deploy initial device configuration] ************************************************************************************************************************************************************************
fatal: [s3]: FAILED! => 
  reason: |-
    this task 'ansible.builtin.command' has extra params, which is only allowed in the following modules: group_by, win_shell, include, import_role, import_tasks, shell, include_tasks, include_role, set_fact, add_host, command, raw, meta, script, include_vars, win_command

    The error appears to be in '/home/jeroen/srlinux/netsim-tools/netsim/ansible/tasks/deploy-config/cumulus.yml': line 10, column 3, but may
    be elsewhere in the file depending on the exact syntax problem.

    The offending line appears to be:

    - name: run /tmp/config.sh to deploy config
      ^ here

Upgrading to Ansible 5.2.0 fixes it.

After updating the topology to use OVS bridges:

defaults:
  device: srlinux
  providers:
    clab:
      bridge_type: ovs-bridge

the OSPF debug logs on SR Linux show:

2022-01-21T18:35:53.237898+00:00 s1 local6|WARN sr_ospf_mgr: ospf|2416|2419|00081|W: Network-instance default - OSPF instance 0: Neighbor 172.16.0.3, using interface ethernet-1/2.0, signaled an unacceptable MTU

For SR Linux - SR OS interop I configured MTU=1500 on SR OS (see https://github.com/ipspace/netsim-tools/blob/master/netsim/ansible/templates/ospf/sros.gnmi.j2#L30).

Cumulus sends:

18:42:23.266166 aa:c1:ab:96:cf:ff > 1a:b0:00:ff:00:02, ethertype IPv4 (0x0800), length 66: (tos 0xc0, ttl 1, id 65010, offset 0, flags [none], proto OSPF (89), length 52)
    172.16.0.3 > 172.16.0.1: OSPFv2, Database Description, length 32
    Router-ID 10.0.0.3, Backbone Area, Authentication Type: none (0)
    Options [External], DD Flags [Init, More, Master], MTU: 9500, Sequence: 0x2ed115c8

while SR Linux allows:

A:s1# mtu <value>
usage: mtu <value>

MTU for the OSPF to use on the interface. For OSPFv3 this must be minimum 1280.
If the MTU defined here exceeds the actual IP-MTU of the interface, then the
IP-MTU of the interface is used.

Positional arguments:
  value             [number, range 512..9486]

After setting MTU to 1500 on Cumulus:

s3(bash)#ip link set swp2 mtu 1500
s3(bash)#ip link set swp1 mtu 1500

OSPF comes up.
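
The behaviour above matches the check described in RFC 2328 (section 10.6): a router rejects a Database Description packet whose Interface MTU field is larger than what it can receive on that interface without fragmentation. Below is a minimal Python sketch of that check using the values from this thread. The function names are made up for illustration and this is not SR Linux or netlab code. It also assumes that OSPF falls back to the interface IP MTU when no OSPF MTU is configured, and that the receiver applies the plain RFC rule (some implementations may be stricter).

```python
from typing import Optional


def effective_ospf_mtu(ip_mtu: int, ospf_mtu: Optional[int] = None) -> int:
    """OSPF MTU used on an interface.

    Mirrors the SR Linux help text quoted above: a configured OSPF MTU
    larger than the interface IP MTU is replaced by the IP MTU.
    Assumption: with no OSPF MTU configured, the IP MTU is used.
    """
    if ospf_mtu is None:
        return ip_mtu
    return min(ospf_mtu, ip_mtu)


def accepts_dd_packet(local_mtu: int, neighbor_dd_mtu: int) -> bool:
    """RFC 2328 10.6: reject a DD packet that advertises a larger MTU
    than the local interface can receive without fragmentation."""
    return neighbor_dd_mtu <= local_mtu


srlinux_mtu = effective_ospf_mtu(ip_mtu=1500)   # SR Linux default-ip-mtu
print(accepts_dd_packet(srlinux_mtu, 9500))     # MTU seen in the Cumulus capture -> False
print(accepts_dd_packet(srlinux_mtu, 1500))     # after 'ip link set ... mtu 1500' -> True
```

With the defaults, the SR Linux side rejects the Cumulus DD packet (9500 > 1500), which is consistent with the "unacceptable MTU" log message. Once both sides use 1500, the packet is accepted.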

The default IP MTU of 1500 on SR Linux comes from:

--{ + candidate shared default }--[ system mtu ]--
A:s1# info detail
    default-port-mtu 9232
    default-l2-mtu 9232
    default-ip-mtu 1500  <-- here
    min-path-mtu 552

A:s1# default-ip-mtu <value:1500>
usage: default-ip-mtu <value>

System default IP MTU in bytes including the IP header but excluding Ethernet overhead

The 7220 IXR-D1, 7220 IXR-D2, 7220 IXR-D3, 7220 IXR-H2, and 7220 IXR-H3 systems support a maximum IP MTU of 9398 bytes.

Positional arguments:
  value             [number, range 1280..9486, default 1500]

Looks like we need to model MTU settings to make things interwork.
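
One possible shape for such a model, as a rough Python sketch: keep a default IP MTU per device type, take the smallest default among the nodes attached to a link, and list the nodes that would need an explicit MTU setting. The `DEFAULT_IP_MTU` table and `link_mtu_plan` function are hypothetical and are not part of the existing data model. The two MTU values are simply the ones observed in this thread.

```python
# Hypothetical per-device default IP MTUs. The values are the ones observed
# in this thread; the structure is an assumption, not the tool's data model.
DEFAULT_IP_MTU = {"srlinux": 1500, "cumulus": 9500}


def link_mtu_plan(link_nodes, node_device):
    """Return (common_mtu, nodes_to_reconfigure) for one link.

    The common MTU is the smallest default among the attached devices;
    every node whose default differs needs an explicit MTU setting.
    """
    defaults = {n: DEFAULT_IP_MTU[node_device[n]] for n in link_nodes}
    common = min(defaults.values())
    changes = sorted(n for n, mtu in defaults.items() if mtu != common)
    return common, changes


# Nodes and multi-node links from the topology at the top of this issue
node_device = {"s1": "srlinux", "s2": "srlinux", "s3": "cumulus"}
for link in (["s1", "s2"], ["s2", "s3"], ["s1", "s2", "s3"]):
    print(link, link_mtu_plan(link, node_device))
```

For the s2-s3 link this yields a common MTU of 1500 with s3 as the node to reconfigure, which is the manual `ip link set ... mtu 1500` fix applied above.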

ipspace commented 2 years ago

It's always the MTU (or DNS or BGP). Thanks a million for a quick response and #177.

We also have to agree on minimum Ansible version. Opening another issue...