oxidecomputer / maghemite

A routing stack written in Rust.
Mozilla Public License 2.0

Administrative removal of peers does not remove prefixes in RIB imported by that peer #349

Closed by rcgoodfellow 1 month ago

rcgoodfellow commented 2 months ago

When a peer is administratively removed, the routes imported from it should be removed from the RIB. However, this does not appear to be happening. We have logic to remove a peer's imports from the RIB when a session exits the established state, but I think we're likely missing the equivalent logic for administrative removal.
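
A minimal sketch of the gap described above, assuming a hypothetical in-memory RIB keyed by prefix. The `Rib` and `Route` types and the two handler functions are illustrative names only and are not maghemite's actual API. The idea is that administrative removal would go through the same per-peer cleanup that already runs when a session leaves the established state.

```rust
use std::collections::BTreeMap;
use std::net::Ipv4Addr;

/// Hypothetical route entry; the fields mirror a few `mgadm` columns.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub nexthop: Ipv4Addr,
    pub peer_id: Ipv4Addr,
    pub local_pref: Option<u32>,
}

/// Hypothetical RIB: prefix string -> routes imported for that prefix.
#[derive(Debug, Default)]
pub struct Rib {
    imported: BTreeMap<String, Vec<Route>>,
}

impl Rib {
    /// Record a route imported for `prefix`.
    pub fn import(&mut self, prefix: &str, route: Route) {
        self.imported.entry(prefix.to_string()).or_default().push(route);
    }

    /// Drop every route learned from `peer_id`, then drop prefixes left empty.
    pub fn remove_peer_routes(&mut self, peer_id: Ipv4Addr) {
        for routes in self.imported.values_mut() {
            routes.retain(|r| r.peer_id != peer_id);
        }
        self.imported.retain(|_, routes| !routes.is_empty());
    }

    /// Prefixes that still have at least one imported route.
    pub fn prefixes(&self) -> Vec<String> {
        self.imported.keys().cloned().collect()
    }
}

/// Hypothetical handler for a session leaving the established state.
pub fn on_session_down(rib: &mut Rib, peer_id: Ipv4Addr) {
    rib.remove_peer_routes(peer_id);
}

/// Hypothetical handler for administrative removal: reuse the same cleanup.
pub fn on_peer_deleted(rib: &mut Rib, peer_id: Ipv4Addr) {
    rib.remove_peer_routes(peer_id);
}
```

Routing both events through one cleanup function means the two paths cannot drift apart. Whether the real fix is structured this way is not stated in this thread.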

elaine-oxide commented 2 months ago

I was using a4x2 with:

Initial state on g3 (serves requests from Oxide CLI):

root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address  Peer ASN     State        State Duration         Hold   Keepalive
169.254.40.1  Some(64502)  Established  6m 19s 121ms           6s/6s  2s/2s
169.254.20.1  Some(64500)  Established  3days 4h 37m 7s 753ms  6s/6s  2s/2s

Initial state reported by CLI output:

$ oxide system networking bgp show-status
switch0
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.10.1  65547      64500       Established    3days 5h 13m 17s 913ms
169.254.30.1  65547      64502       Established    3days 5h 13m 17s 917ms

switch1
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.20.1  65547      64500       Established    3days 4h 37m 28s 229ms
169.254.40.1  65547      64502       Established    6m 39s 596ms

$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.10.2/30  initial-infra  None
169.254.30.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.10.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None
169.254.30.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

switch1/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.20.2/30  initial-infra  None
169.254.40.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.20.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          3               2          None        None      None     None  None        None
169.254.40.1  as65547  [no filtering]  [no filtering]  []           0              0           false             6          0               2          Some(50)    None      None     None  None        None

Destination  Nexthop  Vlan  Preference

Then I deleted the BGP peer.

$ oxide system networking bgp peer delete --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1

Resulting state on g3 (the route from the deleted peer still appears in the output of the first two commands):

root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address  Peer ASN     State        State Duration         Hold   Keepalive
169.254.20.1  Some(64500)  Established  3days 4h 39m 31s 41ms  6s/6s  2s/2s

Resulting state reported by CLI output:

$ oxide system networking bgp show-status
switch0
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.10.1  65547      64500       Established    3days 5h 15m 58s 986ms
169.254.30.1  65547      64502       Established    3days 5h 15m 58s 989ms

switch1
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.20.1  65547      64500       Established    3days 4h 40m 9s 311ms

$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.10.2/30  initial-infra  None
169.254.30.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.10.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None
169.254.30.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

switch1/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.20.2/30  initial-infra  None
169.254.40.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.20.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          3               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

elaine-oxide commented 2 months ago

I was using a4x2 with:

I ran a scenario similar to the one in my last comment: I deleted the BGP peer (which had local pref 50), but this time I added it back with no local pref.

Initial state on g3 before bgp peer delete (serves requests from Oxide CLI):

root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address  Peer ASN     State        State Duration           Hold   Keepalive
169.254.40.1  Some(64502)  Established  12h 20m 24s 800ms        6s/6s  2s/2s
169.254.20.1  Some(64500)  Established  3days 17h 14m 42s 114ms  6s/6s  2s/2s

Initial state reported by CLI output before bgp peer delete:

$ oxide system networking bgp show-status
switch0
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.10.1  65547      64500       Established    3days 17h 51m 15s 663ms
169.254.30.1  65547      64502       Established    3days 17h 51m 15s 667ms

switch1
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.40.1  65547      64502       Established    12h 21m 10s 571ms
169.254.20.1  65547      64500       Established    3days 17h 15m 27s 886ms

$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.10.2/30  initial-infra  None
169.254.30.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.10.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None
169.254.30.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

switch1/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.20.2/30  initial-infra  None
169.254.40.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.20.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          3               2          None        None      None     None  None        None
169.254.40.1  as65547  [no filtering]  [no filtering]  []           0              0           false             6          0               2          Some(50)    None      None     None  None        None

Destination  Nexthop  Vlan  Preference

Delete the BGP peer.

$ oxide system networking bgp peer delete --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1

Resulting state on g3 after bgp peer delete:

root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address  Peer ASN     State        State Duration          Hold   Keepalive
169.254.20.1  Some(64500)  Established  3days 17h 17m 3s 469ms  6s/6s  2s/2s

Resulting state reported by CLI output after bgp peer delete:

$ oxide system networking bgp show-status
switch0
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.10.1  65547      64500       Established    3days 17h 53m 4s 856ms
169.254.30.1  65547      64502       Established    3days 17h 53m 4s 860ms

switch1
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.20.1  65547      64500       Established    3days 17h 17m 17s 87ms

$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.10.2/30  initial-infra  None
169.254.30.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.10.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None
169.254.30.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

switch1/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.20.2/30  initial-infra  None
169.254.40.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.20.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          3               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

Now add the BGP peer back without local pref.

$ oxide system networking bgp peer set --rack 2d1fe615-0fd9-485d-8898-cd024883d222 --switch switch1 --port qsfp0 --addr 169.254.40.1 --bgp-config c722f444-9e03-4c07-8a58-89ed53c43dd5

Resulting state on g3 after bgp peer set (the first command still lists the old route with local pref 50 alongside the new one, and the second command shows that the old route is the one selected):

root@oxz_switch:~# mgadm bgp status imported 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  None        64502      172.20.2.189  None  [64502, 64520]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status selected 65547
BGP Routes
=============
Prefix       Nexthop       Local Pref  Origin AS  Peer ID       MED   AS Path         Stale
0.0.0.0/0    169.254.20.1  None        64500      172.20.2.178  None  [64500, 64510]  None
240.0.0.0/4  169.254.40.1  Some(50)    64502      172.20.2.189  None  [64502, 64520]  None

root@oxz_switch:~# mgadm bgp status neighbors 65547
Peer Address  Peer ASN     State        State Duration           Hold   Keepalive
169.254.40.1  Some(64502)  Established  4s 715ms                 6s/6s  2s/2s
169.254.20.1  Some(64500)  Established  3days 17h 19m 10s 676ms  6s/6s  2s/2s

Resulting state reported by CLI output after bgp peer set:

$ oxide system networking bgp show-status
switch0
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.10.1  65547      64500       Established    3days 17h 55m 23s 384ms
169.254.30.1  65547      64502       Established    3days 17h 55m 23s 387ms

switch1
=======
Peer Address  Local ASN  Remote ASN  Session State  State Duration
169.254.20.1  65547      64500       Established    3days 17h 19m 35s 620ms
169.254.40.1  65547      64502       Established    29s 658ms

$ oxide system networking switch-port-settings show
switch0/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.10.2/30  initial-infra  None
169.254.30.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.10.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None
169.254.30.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference

switch1/qsfp0
=============
Autoneg  Fec   Speed
false    None  Speed100G

Address          Lot            VLAN
169.254.20.2/30  initial-infra  None
169.254.40.2/30  initial-infra  None

BGP Peer      Config   Export          Import          Communities  Connect Retry  Delay Open  Enforce First AS  Hold Time  Idle Hold Time  Keepalive  Local Pref  Md5 Auth  Min TTL  MED   Remote ASN  VLAN
169.254.20.1  as65547  [no filtering]  [no filtering]  []           3              3           false             6          3               2          None        None      None     None  None        None
169.254.40.1  as65547  [no filtering]  [no filtering]  []           0              0           false             6          0               2          None        None      None     None  None        None

Destination  Nexthop  Vlan  Preference
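
One plausible reading of the `imported` and `selected` output after `bgp peer set` is that the leftover entry wins because it carries the higher local preference. The sketch below illustrates that reading only. The `Candidate` type, `select_best`, and the `from_deleted_session` flag are hypothetical, and it assumes selection prefers the highest local pref with `None` ranking below any `Some(_)`. maghemite's actual selection logic may treat an unset local pref differently.

```rust
/// Hypothetical candidate route for a single prefix.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub nexthop: &'static str,
    pub local_pref: Option<u32>,
    /// True for the entry left behind by the deleted peer session.
    pub from_deleted_session: bool,
}

/// Pick the candidate with the highest local preference.
/// Assumption: `None` sorts below any `Some(_)`, which is how Rust
/// orders `Option<u32>` by default.
pub fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().max_by_key(|c| c.local_pref)
}

/// The two 240.0.0.0/4 entries seen in the `imported` table after re-adding
/// the peer without a local pref.
pub fn candidates_after_readd() -> Vec<Candidate> {
    vec![
        Candidate {
            nexthop: "169.254.40.1",
            local_pref: None,
            from_deleted_session: false,
        },
        Candidate {
            nexthop: "169.254.40.1",
            local_pref: Some(50),
            from_deleted_session: true,
        },
    ]
}
```

Under these assumptions `select_best(&candidates_after_readd())` returns the leftover `Some(50)` entry, which matches what the `selected` table shows. If the old routes were removed when the peer was deleted, only the new entry would remain as a candidate.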