Closed rcgoodfellow closed 1 month ago
I was using a4x2 with:
02303a6f03b19b8476fc4bc4a65d6b4e29585c6c
b53f23993fbd2e6c5251ddf10ae5b9fd7813a883
with a merge from https://github.com/oxidecomputer/oxide.rs/pull/808 (commit 457db93dc13a09d45df09597dd0a3c1becd0b086), with merge conflicts resolved by hand.
Initial state on g3 (which serves requests from the Oxide CLI):
root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address Peer ASN State State Duration Hold Keepalive
169.254.40.1 Some(64502) Established 6m 19s 121ms 6s/6s 2s/2s
169.254.20.1 Some(64500) Established 3days 4h 37m 7s 753ms 6s/6s 2s/2s
Initial state reported by CLI output:
$ oxide system networking bgp show-status
switch0
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.10.1 65547 64500 Established 3days 5h 13m 17s 913ms
169.254.30.1 65547 64502 Established 3days 5h 13m 17s 917ms
switch1
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.20.1 65547 64500 Established 3days 4h 37m 28s 229ms
169.254.40.1 65547 64502 Established 6m 39s 596ms
$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.10.2/30 initial-infra None
169.254.30.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.10.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
169.254.30.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
switch1/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.20.2/30 initial-infra None
169.254.40.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.20.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 3 2 None None None None None None
169.254.40.1 as65547 [no filtering] [no filtering] [] 0 0 false 6 0 2 Some(50) None None None None None
Destination Nexthop Vlan Preference
Then I deleted the BGP peer 169.254.40.1 on switch1/qsfp0.
$ oxide system networking bgp peer delete --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1
Resulting state on g3 (the route learned from the deleted peer is still present in the output of the first two commands):
root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address Peer ASN State State Duration Hold Keepalive
169.254.20.1 Some(64500) Established 3days 4h 39m 31s 41ms 6s/6s 2s/2s
Resulting state reported by CLI output:
$ oxide system networking bgp show-status
switch0
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.10.1 65547 64500 Established 3days 5h 15m 58s 986ms
169.254.30.1 65547 64502 Established 3days 5h 15m 58s 989ms
switch1
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.20.1 65547 64500 Established 3days 4h 40m 9s 311ms
$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.10.2/30 initial-infra None
169.254.30.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.10.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
169.254.30.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
switch1/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.20.2/30 initial-infra None
169.254.40.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.20.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 3 2 None None None None None None
Destination Nexthop Vlan Preference
I was using a4x2 with:
98f000bfc68b375e49888ff6946ffa988092084e
with a merge from https://github.com/oxidecomputer/oxide.rs/pull/808 (commit 457db93dc13a09d45df09597dd0a3c1becd0b086), with merge conflicts resolved by hand.
I ran a scenario similar to the one in my last comment, deleting the BGP peer (which had local pref 50), but this time I added it back (with no local pref).
Initial state on g3 before bgp peer delete (serves requests from Oxide CLI):
root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address Peer ASN State State Duration Hold Keepalive
169.254.40.1 Some(64502) Established 12h 20m 24s 800ms 6s/6s 2s/2s
169.254.20.1 Some(64500) Established 3days 17h 14m 42s 114ms 6s/6s 2s/2s
Initial state reported by CLI output before bgp peer delete:
$ oxide system networking bgp show-status
switch0
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.10.1 65547 64500 Established 3days 17h 51m 15s 663ms
169.254.30.1 65547 64502 Established 3days 17h 51m 15s 667ms
switch1
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.40.1 65547 64502 Established 12h 21m 10s 571ms
169.254.20.1 65547 64500 Established 3days 17h 15m 27s 886ms
$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.10.2/30 initial-infra None
169.254.30.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.10.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
169.254.30.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
switch1/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.20.2/30 initial-infra None
169.254.40.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.20.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 3 2 None None None None None None
169.254.40.1 as65547 [no filtering] [no filtering] [] 0 0 false 6 0 2 Some(50) None None None None None
Destination Nexthop Vlan Preference
Delete the BGP peer.
$ oxide system networking bgp peer delete --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1
Resulting state on g3 after bgp peer delete:
root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address Peer ASN State State Duration Hold Keepalive
169.254.20.1 Some(64500) Established 3days 17h 17m 3s 469ms 6s/6s 2s/2s
Resulting state reported by CLI output after bgp peer delete:
$ oxide system networking bgp show-status
switch0
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.10.1 65547 64500 Established 3days 17h 53m 4s 856ms
169.254.30.1 65547 64502 Established 3days 17h 53m 4s 860ms
switch1
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.20.1 65547 64500 Established 3days 17h 17m 17s 87ms
$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.10.2/30 initial-infra None
169.254.30.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.10.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
169.254.30.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
switch1/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.20.2/30 initial-infra None
169.254.40.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.20.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 3 2 None None None None None None
Destination Nexthop Vlan Preference
Now add the BGP peer back without local pref.
$ oxide system networking bgp peer set --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1 --bgp-config c722f444-9e03-4c07-8a58-89ed53c43dd5
Resulting state on g3 after bgp peer set (the old route with local pref 50 is still present in the output of the first two commands, alongside the new one, and the second command shows that the old route is the one selected):
root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 None 64502 172.20.2.189 None [64502, 64520] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix Nexthop Local Pref Origin AS Peer ID MED AS Path Stale
0.0.0.0/0 169.254.20.1 None 64500 172.20.2.178 None [64500, 64510] None
240.0.0.0/4 169.254.40.1 Some(50) 64502 172.20.2.189 None [64502, 64520] None
root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address Peer ASN State State Duration Hold Keepalive
169.254.40.1 Some(64502) Established 4s 715ms 6s/6s 2s/2s
169.254.20.1 Some(64500) Established 3days 17h 19m 10s 676ms 6s/6s 2s/2s
Resulting state reported by CLI output after bgp peer set:
$ oxide system networking bgp show-status
switch0
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.10.1 65547 64500 Established 3days 17h 55m 23s 384ms
169.254.30.1 65547 64502 Established 3days 17h 55m 23s 387ms
switch1
=======
Peer Address Local ASN Remote ASN Session State State Duration
169.254.20.1 65547 64500 Established 3days 17h 19m 35s 620ms
169.254.40.1 65547 64502 Established 29s 658ms
$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.10.2/30 initial-infra None
169.254.30.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.10.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
169.254.30.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
switch1/qsfp0
=============
Autoneg Fec Speed
false None Speed100G
Address Lot VLAN
169.254.20.2/30 initial-infra None
169.254.40.2/30 initial-infra None
BGP Peer Config Export Import Communities Connect Retry Delay Open Enforce First AS Hold Time Idle Hold Time Keepalive Local Pref Md5 Auth Min TTL MED Remote ASN VLAN
169.254.20.1 as65547 [no filtering] [no filtering] [] 3 3 false 6 3 2 None None None None None None
169.254.40.1 as65547 [no filtering] [no filtering] [] 0 0 false 6 0 2 None None None None None None
Destination Nexthop Vlan Preference
When a peer is administratively removed, the routes imported from it should be removed from the RIB. However, this does not appear to be happening. We have logic to remove a peer's imports from the RIB when a session exits the established state, but I think we're likely missing the equivalent logic for administrative removal.
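To make the expected behavior concrete, here is a minimal sketch in Python of a RIB that drops a peer's paths both when a session goes down and when the peer is deleted administratively. It is a toy model, not the actual implementation: the names `Path`, `Rib`, `Router`, `remove_peer_paths`, `on_session_down`, and `delete_peer` are all hypothetical, and matching paths to a peer by nexthop address is an assumption made only because the nexthop and peer address coincide in the output above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Path:
    nexthop: str               # assumed to equal the address of the originating peer
    local_pref: Optional[int]  # None mirrors the "None" column in the mgadm output
    as_path: tuple


@dataclass
class Rib:
    # prefix -> set of paths imported for that prefix
    imported: dict = field(default_factory=dict)

    def add_path(self, prefix: str, path: Path) -> None:
        self.imported.setdefault(prefix, set()).add(path)

    def remove_peer_paths(self, peer: str) -> None:
        """Drop every path learned from `peer`, and any prefix left with no paths."""
        for prefix in list(self.imported):
            kept = {p for p in self.imported[prefix] if p.nexthop != peer}
            if kept:
                self.imported[prefix] = kept
            else:
                del self.imported[prefix]


class Router:
    """Hypothetical stand-in for the daemon that owns the peers and the RIB."""

    def __init__(self) -> None:
        self.rib = Rib()
        self.peers = set()

    def add_peer(self, peer: str) -> None:
        self.peers.add(peer)

    def on_session_down(self, peer: str) -> None:
        # The cleanup described as already existing: session left Established.
        self.rib.remove_peer_paths(peer)

    def delete_peer(self, peer: str) -> None:
        # Administrative removal: the same cleanup must also run here.
        self.peers.discard(peer)
        self.rib.remove_peer_paths(peer)


# Replay the scenario from the comments above against the toy model.
router = Router()
router.add_peer("169.254.20.1")
router.add_peer("169.254.40.1")
router.rib.add_path("0.0.0.0/0", Path("169.254.20.1", None, (64500, 64510)))
router.rib.add_path("240.0.0.0/4", Path("169.254.40.1", 50, (64502, 64520)))

router.delete_peer("169.254.40.1")
after_delete = sorted(router.rib.imported)

# Re-add the peer without a local pref; only the new path should exist.
router.add_peer("169.254.40.1")
router.rib.add_path("240.0.0.0/4", Path("169.254.40.1", None, (64502, 64520)))
after_readd = set(router.rib.imported["240.0.0.0/4"])
```

In this model, `after_delete` contains only `0.0.0.0/0`, and `after_readd` contains only the path with no local pref, so the stale `Some(50)` entry could not be selected after the peer is re-added.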