srl-labs / containerlab

container-based networking labs
https://containerlab.dev
BSD 3-Clause "New" or "Revised" License

0.44.0 broke srlinux tcp stack #1535

Closed hellt closed 1 year ago

hellt commented 1 year ago

See https://github.com/srl-labs/learn-srlinux/pull/87#discussion_r1298976274. The problem can be reproduced with this fixed-up srlfrr example (copy-paste it into /etc/containerlab/lab-examples/srlfrr01/srlfrr01.clab.yml):

name: srlfrr01

topology:
  nodes:
    srl:
      kind: srl
      image: ghcr.io/nokia/srlinux:23.3.3
      startup-config: |
        # configure loopback and data interfaces
        set / interface ethernet-1/1 admin-state enable
        set / interface ethernet-1/1 subinterface 0 admin-state enable
        set / interface ethernet-1/1 subinterface 0 ipv4 admin-state enable address 192.168.1.1/24

        set / interface lo0 subinterface 0 admin-state enable
        set / interface lo0 subinterface 0 ipv4 address 10.10.10.1/32
        set / network-instance default interface ethernet-1/1.0
        set / network-instance default interface lo0.0

        # configure BGP
        set / network-instance default protocols bgp admin-state enable
        set / network-instance default protocols bgp router-id 10.10.10.1
        set / network-instance default protocols bgp autonomous-system 65001
        set / network-instance default protocols bgp afi-safi ipv4-unicast admin-state enable
        set / network-instance default protocols bgp group ibgp export-policy export-lo
        set / network-instance default protocols bgp neighbor 192.168.1.2 admin-state enable
        set / network-instance default protocols bgp neighbor 192.168.1.2 peer-group ibgp
        set / network-instance default protocols bgp neighbor 192.168.1.2 peer-as 65001

        # create export policy
        set / routing-policy policy export-lo statement 10 match protocol local
        set / routing-policy policy export-lo statement 10 action policy-result accept
    frr:
      kind: linux
      image: frrouting/frr:v7.5.0
      binds:
        - daemons:/etc/frr/daemons
      exec:
        - |
          vtysh -c 'configure terminal
          interface eth1
          ip address 192.168.1.2/24
          !
          interface lo
            ip address 10.10.10.2/32
          !
          router bgp 65001
            bgp router-id 10.10.10.2
            neighbor 192.168.1.1 remote-as 65001
            !
            address-family ipv4 unicast
             network 10.10.10.2/32
            exit-address-family'

  links:
    - endpoints: ["srl:e1-1", "frr:eth1"]

The peering stays in the connect state, with the bgp/xdp logs indicating a socket error. The errors are not present when using clab 0.43.0.

Additional issue: the CLAB_INTFS env var shows 0 instead of 1. This needs to be fixed as well.
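A quick way to confirm the CLAB_INTFS mismatch from inside a node could look like the sketch below. It assumes CLAB_INTFS is meant to equal the number of data links attached to the node (one in this topology) and that only `lo` and `eth0` need to be excluded from the count. Both are assumptions: a real SR Linux container may carry extra internal interfaces, so the exclusion list would need extending there. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count data interfaces under a sysfs-style directory.
# Defaults to /sys/class/net; excluding only lo and eth0 is an assumption
# about which interfaces count as "data" links.
count_data_intfs() {
    net_dir="${1:-/sys/class/net}"
    count=0
    for path in "$net_dir"/*; do
        [ -e "$path" ] || continue      # skip the literal glob if the dir is empty
        name=$(basename "$path")
        case "$name" in
            lo|eth0) continue ;;        # loopback and management interface
        esac
        count=$((count + 1))
    done
    echo "$count"
}

# Compare the expected count (CLAB_INTFS) with the interfaces actually present.
# Returns non-zero on a mismatch so it can be used in scripts.
check_clab_intfs() {
    expected="${CLAB_INTFS:-unset}"
    actual=$(count_data_intfs "$1")
    if [ "$expected" = "$actual" ]; then
        echo "ok: CLAB_INTFS=$expected matches $actual data interface(s)"
    else
        echo "mismatch: CLAB_INTFS=$expected, found $actual data interface(s)"
        return 1
    fi
}
```

Running `check_clab_intfs` with no argument inside the affected container would then print a mismatch line if the variable is 0 while `e1-1` is present.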

hellt commented 1 year ago

I diffed the e1-1 and e1-1-0 interfaces between 0.43 and 0.44. Nothing extraordinary jumps out at me...

find /sys/class/net/e1-1/  -type f -exec sh -c 'echo "File: $1"; cat "$1"; echo "------------------------"' _ {} \;

Left pane is 0.43

https://www.diffchecker.com/UmCegYiJ/

Diff between the xdp_lc_1.log files: https://www.diffchecker.com/miAICkay/
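To make such comparisons repeatable without an online diff tool, the find command above could be wrapped so that files are always emitted in the same order. This is only a sketch: `dump_attrs` and the output file names are hypothetical, and unreadable sysfs attributes are replaced with a placeholder instead of an error.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file under a directory in sorted
# order, using the same "File: ..." framing as the find one-liner, so that
# two dumps can be compared with a plain diff.
dump_attrs() {
    dir="$1"
    find "$dir" -type f | LC_ALL=C sort | while IFS= read -r f; do
        echo "File: $f"
        # Some sysfs attributes are not readable; mark them instead of failing.
        cat "$f" 2>/dev/null || echo "<unreadable>"
        echo "------------------------"
    done
}
```

For example, run `dump_attrs /sys/class/net/e1-1/ > e1-1.clab-0.43.txt` on a lab deployed with 0.43, repeat with 0.44 into `e1-1.clab-0.44.txt`, and then compare the two with `diff -u`.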

hellt commented 1 year ago

In https://github.com/srl-labs/containerlab/pull/1536 I reverted #1475 and redeployed the lab, which fixed the original issue. Now the hunt is on to find out what broke what.