sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

Default route removed from APP_DB on shutdown of management port. #9188

Closed arlakshm closed 2 years ago

arlakshm commented 2 years ago

Description

Bringing down the management port (eth0) removes the default route from APP_DB.

Steps to reproduce the issue:

  1. Learn the default route via BGP.
  2. Confirm the default route is present in APP_DB.
  3. Shut down the management port.
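
The check in step 2 (and the same check repeated after step 3) can be scripted. The following is a minimal sketch, not part of the original report: it parses the text printed by "redis-cli KEYS *0.0.0.0*" and reports whether the default route key is present. The function names are hypothetical; the key name ROUTE_TABLE:0.0.0.0/0 is taken from the CLI output in this report.

```python
import re

# Matches one line of redis-cli's numbered reply, e.g.: 1) "ROUTE_TABLE:0.0.0.0/0"
_NUMBERED_REPLY = re.compile(r'^\d+\)\s+"(.*)"$')


def parse_redis_keys(output):
    """Return the key names found in the text printed by `redis-cli KEYS`."""
    keys = []
    for raw_line in output.splitlines():
        line = raw_line.strip()
        # Skip blank lines and markers such as "(empty array)".
        if not line or line.startswith("("):
            continue
        match = _NUMBERED_REPLY.match(line)
        # Piped redis-cli output has no numbering, so fall back to the bare line.
        keys.append(match.group(1) if match else line.strip('"'))
    return keys


def has_default_route(keys, table="ROUTE_TABLE"):
    """True if the IPv4 default route key is present among the given keys."""
    return f"{table}:0.0.0.0/0" in keys


if __name__ == "__main__":
    before = '1) "ROUTE_TABLE:0.0.0.0/0"\n2) "ROUTE_TABLE:10.0.0.0/31"'
    after = "(empty array)"
    print("before shutdown:", has_default_route(parse_redis_keys(before)))
    print("after shutdown: ", has_default_route(parse_redis_keys(after)))
```

Running the check before and after shutting down eth0 makes the regression visible as a change from present to absent.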

Describe the results you received:

The default route is removed from the APP_DB.

Describe the results you expected:

The default route should remain in APP_DB. Currently, it is added back only after the BGP session is shut down and started again.

Output of show version:

This issue is seen on Master, 20201231 and 202106 branches.

CLI output

admin@sonic:~$ show ip bgp summary

IPv4 Unicast Summary:
BGP router identifier 10.1.0.32, local AS number 65100 vrf-id 0
BGP table version 6428
RIB entries 12851, using 2467392 bytes of memory
Peers 24, using 523584 KiB of memory
Peer groups 4, using 256 bytes of memory

Neighbhor      V     AS    MsgRcvd    MsgSent    TblVer    InQ    OutQ  Up/Down      State/PfxRcd  NeighborName
-----------  ---  -----  ---------  ---------  --------  -----  ------  ---------  --------------  --------------
10.0.0.1       4  65200       3195         27         0      0       0  00:05:45             6370  ARISTA01T2
10.0.0.5       4  65200       3198         29         0      0       0  00:05:45             6370  ARISTA03T2
10.0.0.9       4  65200       3195         27         0      0       0  00:05:45             6370  ARISTA05T2
10.0.0.13      4  65200       3195         28         0      0       0  00:05:45             6370  ARISTA07T2
10.0.0.17      4  65200       3195         27         0      0       0  00:05:45             6370  ARISTA09T2
10.0.0.21      4  65200       3195         27         0      0       0  00:05:45             6370  ARISTA11T2
10.0.0.25      4  65200       3195         27         0      0       0  00:05:45             6370  ARISTA13T2
10.0.0.29      4  65200       3195         28         0      0       0  00:05:45             6370  ARISTA15T2
10.0.0.33      4  64001         12       3213         0      0       0  00:05:47                4  ARISTA01T0
10.0.0.35      4  64002         11       3212         0      0       0  00:05:04                3  ARISTA02T0
10.0.0.37      4  64003         12       3212         0      0       0  00:05:04                4  ARISTA03T0
10.0.0.39      4  64004         11       3212         0      0       0  00:05:04                3  ARISTA04T0
10.0.0.41      4  64005         11       3212         0      0       0  00:05:04                3  ARISTA05T0
10.0.0.43      4  64006         11       3212         0      0       0  00:05:04                3  ARISTA06T0
10.0.0.45      4  64007         11       3212         0      0       0  00:05:04                3  ARISTA07T0
10.0.0.47      4  64008         11       3212         0      0       0  00:05:00                3  ARISTA08T0
10.0.0.49      4  64009         11       3212         0      0       0  00:05:00                3  ARISTA09T0
10.0.0.51      4  64010         11       3212         0      0       0  00:05:00                3  ARISTA10T0
10.0.0.53      4  64011         11       3212         0      0       0  00:05:00                3  ARISTA11T0
10.0.0.55      4  64012         11       3212         0      0       0  00:05:00                3  ARISTA12T0
10.0.0.57      4  64013         11       3212         0      0       0  00:05:00                3  ARISTA13T0
10.0.0.59      4  64014         11       3213         0      0       0  00:05:46                3  ARISTA14T0
10.0.0.61      4  64015         11       3213         0      0       0  00:05:46                3  ARISTA15T0
10.0.0.63      4  64016         11       3213         0      0       0  00:05:45                3  ARISTA16T0

Total number of neighbors 24
admin@sonic:~$ show ip route 0.0.0.0/0
Routing entry for 0.0.0.0/0
  Known via "bgp", distance 20, metric 0, best
  Last update 00:05:51 ago
  * 10.0.0.1, via PortChannel0002, weight 1
  * 10.0.0.5, via PortChannel0005, weight 1
  * 10.0.0.9, via PortChannel0008, weight 1
  * 10.0.0.13, via PortChannel0011, weight 1
  * 10.0.0.17, via PortChannel0014, weight 1
  * 10.0.0.21, via PortChannel0017, weight 1
  * 10.0.0.25, via PortChannel0020, weight 1
  * 10.0.0.29, via PortChannel0023, weight 1

Routing entry for 0.0.0.0/0
  Known via "static", distance 200, metric 0
  Last update 00:06:17 ago
    10.3.146.1, via eth0, weight 1

admin@sonic:~$ redis-cli KEYS *0.0.0.0*
1) "ROUTE_TABLE:0.0.0.0/0"
2) "ROUTE_TABLE:10.0.0.0/31"
3) "INTF_TABLE:PortChannel0002:10.0.0.0/31"
admin@sonic:~$ redis-cli hgetall "ROUTE_TABLE:0.0.0.0/0"
1) "nexthop"
2) "10.0.0.1,10.0.0.5,10.0.0.9,10.0.0.13,10.0.0.17,10.0.0.21,10.0.0.25,10.0.0.29"
3) "ifname"
4) "PortChannel0002,PortChannel0005,PortChannel0008,PortChannel0011,PortChannel0014,PortChannel0017,PortChannel0020,PortChannel0023"
admin@sonic:~$
admin@sonic:~$ sudo ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.3.146.185  netmask 255.255.254.0  broadcast 10.3.147.255
        inet6 fc00:2::32  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::968e:d3ff:fe99:10  prefixlen 64  scopeid 0x20<link>
        ether 94:8e:d3:99:00:10  txqueuelen 1000  (Ethernet)
        RX packets 16722  bytes 1224912 (1.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2453  bytes 441789 (431.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 29

admin@sonic:~$
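
The ROUTE_TABLE entry shown above stores the next hops and their outgoing interfaces as two parallel comma-separated fields, "nexthop" and "ifname". As a minimal sketch (the function names are hypothetical and not part of SONiC), the numbered "redis-cli hgetall" output can be turned into (next hop, interface) pairs like this:

```python
import re

# Matches one line of redis-cli's numbered reply, e.g.: 1) "nexthop"
_NUMBERED_REPLY = re.compile(r'^\d+\)\s+"(.*)"$')


def parse_hgetall(output):
    """Turn numbered `redis-cli hgetall` output into a field -> value dict."""
    values = []
    for raw_line in output.splitlines():
        match = _NUMBERED_REPLY.match(raw_line.strip())
        if match:
            values.append(match.group(1))
    # HGETALL replies alternate: field, value, field, value, ...
    return dict(zip(values[0::2], values[1::2]))


def route_nexthops(fields):
    """Pair each next hop with its interface; empty list if the fields are absent."""
    nexthops = [hop for hop in fields.get("nexthop", "").split(",") if hop]
    ifnames = [name for name in fields.get("ifname", "").split(",") if name]
    return list(zip(nexthops, ifnames))


if __name__ == "__main__":
    sample = (
        '1) "nexthop"\n'
        '2) "10.0.0.1,10.0.0.5"\n'
        '3) "ifname"\n'
        '4) "PortChannel0002,PortChannel0005"\n'
    )
    for hop, ifname in route_nexthops(parse_hgetall(sample)):
        print(hop, "via", ifname)
```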

admin@sonic:~$ show ip bgp summary

IPv4 Unicast Summary:
BGP router identifier 10.1.0.32, local AS number 65100 vrf-id 0
BGP table version 6428
RIB entries 12851, using 2467392 bytes of memory
Peers 24, using 523584 KiB of memory
Peer groups 4, using 256 bytes of memory

Neighbhor      V     AS    MsgRcvd    MsgSent    TblVer    InQ    OutQ  Up/Down      State/PfxRcd  NeighborName
-----------  ---  -----  ---------  ---------  --------  -----  ------  ---------  --------------  --------------
10.0.0.1       4  65200       3197         29         0      0       0  00:07:05             6370  ARISTA01T2
10.0.0.5       4  65200       3200         31         0      0       0  00:07:05             6370  ARISTA03T2
10.0.0.9       4  65200       3197         29         0      0       0  00:07:05             6370  ARISTA05T2
10.0.0.13      4  65200       3197         30         0      0       0  00:07:05             6370  ARISTA07T2
10.0.0.17      4  65200       3197         29         0      0       0  00:07:05             6370  ARISTA09T2
10.0.0.21      4  65200       3197         29         0      0       0  00:07:05             6370  ARISTA11T2
10.0.0.25      4  65200       3197         29         0      0       0  00:07:05             6370  ARISTA13T2
10.0.0.29      4  65200       3197         30         0      0       0  00:07:05             6370  ARISTA15T2
10.0.0.33      4  64001         14       3215         0      0       0  00:07:07                4  ARISTA01T0
10.0.0.35      4  64002         12       3213         0      0       0  00:06:24                3  ARISTA02T0
10.0.0.37      4  64003         13       3213         0      0       0  00:06:24                4  ARISTA03T0
10.0.0.39      4  64004         12       3213         0      0       0  00:06:24                3  ARISTA04T0
10.0.0.41      4  64005         12       3213         0      0       0  00:06:24                3  ARISTA05T0
10.0.0.43      4  64006         12       3213         0      0       0  00:06:24                3  ARISTA06T0
10.0.0.45      4  64007         12       3213         0      0       0  00:06:24                3  ARISTA07T0
10.0.0.47      4  64008         12       3213         0      0       0  00:06:20                3  ARISTA08T0
10.0.0.49      4  64009         12       3213         0      0       0  00:06:20                3  ARISTA09T0
10.0.0.51      4  64010         12       3213         0      0       0  00:06:20                3  ARISTA10T0
10.0.0.53      4  64011         12       3213         0      0       0  00:06:20                3  ARISTA11T0
10.0.0.55      4  64012         12       3213         0      0       0  00:06:20                3  ARISTA12T0
10.0.0.57      4  64013         12       3213         0      0       0  00:06:20                3  ARISTA13T0
10.0.0.59      4  64014         13       3215         0      0       0  00:07:06                3  ARISTA14T0
10.0.0.61      4  64015         13       3215         0      0       0  00:07:06                3  ARISTA15T0
10.0.0.63      4  64016         13       3215         0      0       0  00:07:05                3  ARISTA16T0

Total number of neighbors 24
admin@sonic:~$ show ip route 0.0.0.0/0
Routing entry for 0.0.0.0/0
  Known via "static", distance 200, metric 0
  Last update 00:00:49 ago
    10.3.146.1 inactive, weight 1

Routing entry for 0.0.0.0/0
  Known via "bgp", distance 20, metric 0, best
  Last update 00:07:15 ago

admin@sonic:~$ ip route show 0.0.0.0/0
default proto bgp src 10.1.0.32 metric 20
        nexthop via 10.0.0.1 dev PortChannel0002 weight 1
        nexthop via 10.0.0.5 dev PortChannel0005 weight 1
        nexthop via 10.0.0.9 dev PortChannel0008 weight 1
        nexthop via 10.0.0.13 dev PortChannel0011 weight 1
        nexthop via 10.0.0.17 dev PortChannel0014 weight 1
        nexthop via 10.0.0.21 dev PortChannel0017 weight 1
        nexthop via 10.0.0.25 dev PortChannel0020 weight 1
        nexthop via 10.0.0.29 dev PortChannel0023 weight 1
admin@sonic:~$ redis-cli KEYS 0.0.0.0/0
(empty array)
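
The output above captures the symptom: the kernel still holds the BGP default route with its next hops, while the key query against APP_DB returns nothing. A minimal sketch of that comparison follows. It works on captured command output only; the function names are hypothetical, and the key name ROUTE_TABLE:0.0.0.0/0 is assumed from the earlier output in this report.

```python
def kernel_default_nexthops(ip_route_output):
    """Return (gateway, device) pairs from `ip route show 0.0.0.0/0` output.

    Handles both the single-path form ("default via A dev X") and the
    multipath form ("nexthop via A dev X weight 1 nexthop via B dev Y ...").
    """
    tokens = ip_route_output.split()
    pairs = []
    for i, token in enumerate(tokens):
        # Look for the token pattern: via <gateway> dev <device>
        if token == "via" and i + 3 < len(tokens) and tokens[i + 2] == "dev":
            pairs.append((tokens[i + 1], tokens[i + 3]))
    return pairs


def default_route_out_of_sync(ip_route_output, app_db_keys):
    """True if the kernel has a default route but APP_DB has no key for it."""
    in_kernel = bool(kernel_default_nexthops(ip_route_output))
    in_app_db = "ROUTE_TABLE:0.0.0.0/0" in app_db_keys
    return in_kernel and not in_app_db


if __name__ == "__main__":
    kernel = (
        "default proto bgp src 10.1.0.32 metric 20 "
        "nexthop via 10.0.0.1 dev PortChannel0002 weight 1 "
        "nexthop via 10.0.0.5 dev PortChannel0005 weight 1"
    )
    print(kernel_default_nexthops(kernel))
    print(default_route_out_of_sync(kernel, app_db_keys=[]))
```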



MaratGubaiev commented 2 years ago

Hi @arlakshm,

I have started working on a fix. I could reproduce the issue only after a reboot:

admin@sonic:~$ redis-cli KEYS *0.0.0.0*
1) "INTF_TABLE:Ethernet0:10.0.0.0/31"
admin@sonic:~$ sudo ifconfig eth0 down
admin@sonic:~$ redis-cli KEYS *0.0.0.0*
1) "INTF_TABLE:Ethernet0:10.0.0.0/31"

sudo reboot

admin@sonic:~$ redis-cli KEYS *0.0.0.0*
(empty array)
admin@sonic:~$ sudo route add default gw 192.168.111.3 eth0
admin@sonic:~$ redis-cli KEYS *0.0.0.0*
1) "INTF_TABLE:Ethernet0:10.0.0.0/31"

Then I added

"MGMT_INTERFACE": {
        "eth0|192.168.111.214/24": {
            "gwaddr": "192.168.111.3"
        }
    }

to /etc/sonic/config_db.json and ran "config load", which gave:

admin@sonic:~$ show management_interface address
Management IP address = 192.168.111.214/24
Management Network Default Gateway = 192.168.111.3
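
The configuration change above can also be expressed programmatically. This is a minimal sketch that only builds the dictionary. It does not write /etc/sonic/config_db.json or run "config load". The helper name add_mgmt_interface is hypothetical; the key layout ("eth0|<address>/<prefix length>" with a "gwaddr" field) follows the fragment shown above.

```python
import copy
import json


def add_mgmt_interface(config, ip_prefix, gateway, ifname="eth0"):
    """Return a copy of `config` with a MGMT_INTERFACE entry added.

    The input dict is left untouched so the caller can diff old and new.
    """
    updated = copy.deepcopy(config)
    key = f"{ifname}|{ip_prefix}"
    updated.setdefault("MGMT_INTERFACE", {})[key] = {"gwaddr": gateway}
    return updated


if __name__ == "__main__":
    config = {}  # stands in for the parsed contents of config_db.json
    updated = add_mgmt_interface(config, "192.168.111.214/24", "192.168.111.3")
    print(json.dumps(updated, indent=4))
```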

After that, I could no longer reproduce the issue, even after a reboot. (Immediately after a reboot the query returns (empty array), but after several seconds it works normally.)

Adding MGMT_INTERFACE also fixed the following issue, in which the default route is not restored after eth0 is brought back up:

admin@sonic:~$ ip route show 0.0.0.0/0
default via 192.168.111.3 dev eth0
admin@sonic:~$ sudo ifconfig eth0 down
admin@sonic:~$ ip route show 0.0.0.0/0
admin@sonic:~$ sudo ifconfig eth0 up
admin@sonic:~$ ip route show 0.0.0.0/0
admin@sonic:~$

Platform is "mellanox".

Is this the intended solution? Have I understood everything correctly?

Best regards, Marat Gubaiev

prsunny commented 2 years ago

Fixed by patch