FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

IBGP network for loopback not considered valid route when set as neighbor #16877

Closed snuffy22 closed 1 week ago

snuffy22 commented 1 month ago

Description

Something changed after FRR 7.5. That was the last version in which the loopback address of an iBGP peer, when advertised as a network, was correctly considered a valid route on the receiving device.

In the simplest example, R1 is connected directly to R2 via eth1.

In the below example R2 (10.0.0.2) is the BGP neighbor of R1.

As the output below shows, R1 considers R2's neighbor address (10.0.0.2/32) inaccessible, while the other address R2 advertises, 10.0.0.5/32, is fine.

 r1# sh ip bgp
 BGP table version is 2, local router ID is 10.0.0.1, vrf id 0
 Default local pref 100, local AS 65002
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete
 RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
  *> 10.0.0.1/32      0.0.0.0(r1)              0         32768 i
    i10.0.0.2/32      10.0.0.2(r2)             0    100      0 i
  *>i10.0.0.5/32      10.0.0.2(r2)             0    100      0 i

 Displayed 3 routes and 3 total paths
r1# sh ip bgp 10.0.0.2
BGP routing table entry for 10.0.0.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.2(r2) (inaccessible, import-check enabled) from r2(10.0.0.2) (10.0.0.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Thu Sep 19 12:27:22 2024
r1#
r1# sh ip bgp 10.0.0.5
BGP routing table entry for 10.0.0.5/32, version 2
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Local
    10.0.0.2(r2) from r2(10.0.0.2) (10.0.0.2)
      Origin IGP, metric 0, localpref 100, valid, internal, bestpath-from-AS Local, best (First path received)
      Last update: Thu Sep 19 12:28:47 2024
r1#
r1# sh ip bgp summ

IPv4 Unicast Summary:
BGP router identifier 10.0.0.1, local AS number 65002 VRF default vrf-id 0
BGP table version 2
RIB entries 5, using 480 bytes of memory
Peers 1, using 13 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
r2(10.0.0.2)    4      65002     15846     15845        2    0    0 13:12:05            2        1 r2

Total number of neighbors 1
r1#

Version

r1# sh ver
FRRouting 10.0.1_git (r1) on Linux(6.8.0-35-generic).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--sbindir=/usr/lib/frr' '--libdir=/usr/lib' '--enable-rpki' '--enable-vtysh' '--enable-multipath=64' '--enable-vty-group=frrvty' '--enable-user=frr' '--enable-group=frr' '--enable-pcre2posix' '--enable-scripting' 'CC=gcc' 'CXX=g++'

How to reproduce

  1. Create a two-router network with iBGP.
  2. Configure both routers to peer with each other's loopback address.
  3. Add a static route on each router to reach the other router's loopback.
  4. The BGP session comes up fine.
  5. The loopback address configured as the neighbor is not considered reachable, but any other network advertised by that neighbor is.

r1# sh run
Building configuration...

Current configuration:
!
frr version 10.0.1_git
frr defaults datacenter
hostname r1
no ipv6 forwarding
service integrated-vtysh-config
!
ip route 10.0.0.2/32 eth1
!
vrf mgmt
exit-vrf
!
interface eth1
 description r1 -> r2
 ip address 10.1.0.1/30
exit
!
interface lo
 ip address 10.0.0.1/32
exit
!
router bgp 65002
 bgp router-id 10.0.0.1
 no bgp default ipv4-unicast
 bgp bestpath as-path multipath-relax
 neighbor 10.0.0.2 remote-as 65002
 neighbor 10.0.0.2 description r2
 neighbor 10.0.0.2 update-source lo
 !
 address-family ipv4 unicast
  network 10.0.0.1/32
  neighbor 10.0.0.2 activate
  neighbor 10.0.0.2 next-hop-self
 exit-address-family
exit
!
end

r2# sh run
Building configuration...

Current configuration:
!
frr version 10.0.1_git
frr defaults datacenter
hostname r2
no ipv6 forwarding
service integrated-vtysh-config
!
ip route 10.0.0.1/32 eth1
!
vrf mgmt
exit-vrf
!
interface eth1
 description r2 -> r1
 ip address 10.1.0.2/30
exit
!
interface lo
 ip address 10.0.0.2/32
 ip address 10.0.0.5/32
exit
!
router bgp 65002
 bgp router-id 10.0.0.2
 no bgp default ipv4-unicast
 bgp bestpath as-path multipath-relax
 neighbor 10.0.0.1 remote-as 65002
 neighbor 10.0.0.1 description r1
 neighbor 10.0.0.1 update-source lo
 !
 address-family ipv4 unicast
  network 10.0.0.2/32
  network 10.0.0.5/32
  neighbor 10.0.0.1 activate
  neighbor 10.0.0.1 next-hop-self
 exit-address-family
exit
!
end
r2#

Expected behavior

The BGP route for the neighbor's loopback should be considered valid on R1, so that R1 can advertise it.

Actual behavior

The loopback address configured as the BGP neighbor is marked inaccessible on both routers:

r1# sh ip bgp 10.0.0.2
BGP routing table entry for 10.0.0.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.2(r2) (inaccessible, import-check enabled) from r2(10.0.0.2) (10.0.0.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Thu Sep 19 12:27:22 2024
r1#
r2# sh ip bgp 10.0.0.1/32
BGP routing table entry for 10.0.0.1/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.1(r1) (inaccessible, import-check enabled) from r1(10.0.0.1) (10.0.0.1)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Thu Sep 19 12:27:22 2024
r2#

Additional context

No response

donaldsharp commented 1 month ago

https://docs.frrouting.org/en/latest/bgp.html#clicmd-bgp-disable-ebgp-connected-route-check ?

snuffy22 commented 1 month ago

This is IBGP. That flag applies to EBGP, which we are not using in this case.

ton31337 commented 4 weeks ago

Try "no bgp network import-check".

snuffy22 commented 4 weeks ago

This makes no difference. Basically, I'd expect that if the 10.0.0.5/32 route works, the 10.0.0.2/32 route should work too.

As stated, this last worked correctly in 7.5. Unfortunately, there seems to be some 'interaction' between 10.0.0.2 being a neighbor and it being a valid route on 10.0.0.1 (r1).

ton31337 commented 4 weeks ago

Also... did you try enabling ebgp-multihop?

snuffy22 commented 3 weeks ago

This made no difference; as stated, this is an IBGP neighbour/peer.

Cheers

ton31337 commented 3 weeks ago

Then could you enable debug logs and show what's happening? debug bgp neighbor, debug bgp update, debug bgp nht.

snuffy22 commented 3 weeks ago

Here is the output.

r1# debug bgp neigh
BGP neighbor-events debugging is on
r1# debug bgp update detail
BGP updates detail debugging is on
r1# debug bgp nht
BGP nexthop tracking debugging is on
r1# term mon
r1# clear ip bgp *

2024-09-24 04:30:19.276 [INFO] bgpd: [H7QV4-WR3ZD] %NOTIFICATION(Hard Reset): sent to neighbor 10.0.0.2 6/4 (Cease/Administrative Reset) ""
2024-09-24 04:30:19.276 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] BGP_Stop (Established->Clearing), fd 26
2024-09-24 04:30:19.276 [INFO] bgpd: [PXVXG-TFNNT] %ADJCHANGE: neighbor 10.0.0.2(r2) in vrf default Down User reset
2024-09-24 04:30:19.277 [DEBG] bgpd: [V4R0W-D4WGF] 10.0.0.2(Unknown) Update Group Hash: sort: 1 sub_sort: 0 UpdGrpFlags: 0 UpdGrpAFFlags: 553648135
2024-09-24 04:30:19.277 [DEBG] bgpd: [NVVBY-K8MCE] 10.0.0.2(Unknown) Update Group Hash: addpath: 4 UpdGrpCapFlag: 256 UpdGrpCapAFFlag: 2048 route_adv: 0 change local as: 0, as_path_loop_detection: 0
2024-09-24 04:30:19.277 [DEBG] bgpd: [X4CQ0-63QKB] 10.0.0.2(Unknown) Update Group Hash: addpath paths-limit: (send 0, receive 0)
2024-09-24 04:30:19.277 [DEBG] bgpd: [Z8Q37-65KK3] 10.0.0.2(Unknown) Update Group Hash: max packet size: 65535 pmax_out: 0 Peer Group: (NONE) rmap out: (NONE)
2024-09-24 04:30:19.277 [DEBG] bgpd: [SM2F3-HRYKP] 10.0.0.2(Unknown) Update Group Hash: dlist out: (NONE) plist out: (NONE) aslist out: (NONE) usmap out: (NONE) advmap: (NONE) 0
2024-09-24 04:30:19.277 [DEBG] bgpd: [V8B3M-T6VFC] 10.0.0.2(Unknown) Update Group Hash: default rmap: (NONE) shared network and afi active network: 0
2024-09-24 04:30:19.277 [DEBG] bgpd: [Y5EX3-GHT5V] 10.0.0.2(Unknown) Update Group Hash: Lonesoul: 0 ORF prefix: 0 max prefix out: 0
2024-09-24 04:30:19.277 [DEBG] bgpd: [X19K7-9V4K2] 10.0.0.2(Unknown) Update Group Hash: local role: 255 AIGP: 0 SOO: (NONE)
2024-09-24 04:30:19.277 [DEBG] bgpd: [SQ314-QBJCR] 10.0.0.2(Unknown) Update Group Hash key: 241883886
2024-09-24 04:30:19.279 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Clearing established_peers 0
2024-09-24 04:30:19.279 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd -1 went from Established to Clearing
2024-09-24 04:30:19.289 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] Clearing_Completed (Clearing->Idle), fd -1
2024-09-24 04:30:19.289 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Idle established_peers 0
2024-09-24 04:30:19.289 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd -1 went from Clearing to Idle
2024-09-24 04:30:21.290 [DEBG] bgpd: [ZQTB5-H8522] 10.0.0.2 [FSM] Timer (start timer expire).
2024-09-24 04:30:21.290 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] BGP_Start (Idle->Connect), fd -1
2024-09-24 04:30:21.290 [DEBG] bgpd: [WNKP5-SN018] Found existing bnc 10.0.0.2/32(0)(VRF default) flags 0xb ifindex 0 #paths 0 peer 0x7b28e8b9dac0
2024-09-24 04:30:21.290 [DEBG] bgpd: [Z195V-FNKRK] 10.0.0.2 [Event] Connect start to 10.0.0.2 fd 26
2024-09-24 04:30:21.291 [DEBG] bgpd: [G0837-S7QES] 10.0.0.2 [FSM] Non blocking connect waiting result, fd 26
2024-09-24 04:30:21.291 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Connect established_peers 0
2024-09-24 04:30:21.291 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 26 went from Idle to Connect
2024-09-24 04:30:21.291 [DEBG] bgpd: [T04AP-5W1P3] [Event] connection from 10.0.0.2 fd 28, active peer status 2 fd 26
2024-09-24 04:30:21.291 [DEBG] bgpd: [WNKP5-SN018] Found existing bnc 10.0.0.2/32(0)(VRF default) flags 0xb ifindex 0 #paths 0 peer 0x7b28e8b9dac0
2024-09-24 04:30:21.291 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Active established_peers 0
2024-09-24 04:30:21.291 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 28 went from Idle to Active
2024-09-24 04:30:21.291 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] TCP_connection_open (Active->OpenSent), fd 28
2024-09-24 04:30:21.291 [DEBG] bgpd: [WECS1-Q4P17] 10.0.0.2 passive open
2024-09-24 04:30:21.291 [DEBG] bgpd: [XKJ09-9VTZ7] 10.0.0.2 Sending hostname cap with hn = r1, dn = (null)
2024-09-24 04:30:21.291 [DEBG] bgpd: [YJSKD-N5GGC] 10.0.0.2 Sending Software Version cap, value: FRRouting/10.1.1_git
2024-09-24 04:30:21.291 [DEBG] bgpd: [JFFAN-DEGED] 10.0.0.2 sending OPEN, version 4, my as 65002, holdtime 9, id 10.0.0.1
2024-09-24 04:30:21.291 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: OpenSent established_peers 0
2024-09-24 04:30:21.291 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 28 went from Active to OpenSent
2024-09-24 04:30:21.291 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] TCP_connection_open (Connect->OpenSent), fd 26
2024-09-24 04:30:21.291 [DEBG] bgpd: [RWZTG-AA74G] 10.0.0.2 open active, local address 10.0.0.1
2024-09-24 04:30:21.292 [DEBG] bgpd: [XKJ09-9VTZ7] 10.0.0.2 Sending hostname cap with hn = r1, dn = (null)
2024-09-24 04:30:21.292 [DEBG] bgpd: [YJSKD-N5GGC] 10.0.0.2 Sending Software Version cap, value: FRRouting/10.1.1_git
2024-09-24 04:30:21.292 [DEBG] bgpd: [JFFAN-DEGED] 10.0.0.2 sending OPEN, version 4, my as 65002, holdtime 9, id 10.0.0.1
2024-09-24 04:30:21.292 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: OpenSent established_peers 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 26 went from Connect to OpenSent
2024-09-24 04:30:21.292 [DEBG] bgpd: [WNM1E-D314G] 10.0.0.2 rcv OPEN, version 4, remote-as (in open) 65002, holdtime 9, id 10.0.0.2
2024-09-24 04:30:21.292 [INFO] bgpd: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 10.0.0.2 6/7 (Cease/Connection Collision Resolution) 0 bytes
2024-09-24 04:30:21.292 [DEBG] bgpd: [QG29C-5TSVS] 10.0.0.2 rcv OPEN w/ OPTION parameter len: 99
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has MultiProtocol Extensions capability (1), length 4
2024-09-24 04:30:21.292 [DEBG] bgpd: [VXVH9-3MXR0] 10.0.0.2 OPEN has MultiProtocol Extensions capability for afi/safi: IPv4/unicast
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Route Refresh capability (2), length 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Enhanced Route Refresh capability (70), length 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has 4-octet AS number capability (65), length 4
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has BGP Extended Message capability (6), length 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has AddPath capability (69), length 4
2024-09-24 04:30:21.292 [DEBG] bgpd: [SBSKM-G6QBW] 10.0.0.2 OPEN has AddPath capability for afi/safi: IPv4/unicast, receive
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 7
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Paths-Limit capability (76), length 5
2024-09-24 04:30:21.292 [DEBG] bgpd: [MSJ7S-E9MRG] 10.0.0.2 OPEN has Paths-Limit capability for afi/safi: IPv4/unicast limit: 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Dynamic capability (67), length 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has FQDN capability (73), length 4
2024-09-24 04:30:21.292 [DEBG] bgpd: [MR6EG-NS6N7] 10.0.0.2 received hostname r2, domainname (null)
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 4
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Graceful Restart capability (64), length 2
2024-09-24 04:30:21.292 [DEBG] bgpd: [KDY57-R1CN0] 10.0.0.2 Peer has not restarted. Restart Time: 120, N-bit set: yes
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 9
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Long-lived BGP Graceful Restart capability (71), length 7
2024-09-24 04:30:21.292 [DEBG] bgpd: [ZVHDJ-HW40B] 10.0.0.2 Addr-family IPv4/unicast(afi/safi) not enabled. Ignore the Long-lived Graceful Restart capability
2024-09-24 04:30:21.292 [DEBG] bgpd: [NVZPF-5ST3B] 10.0.0.2 rcvd OPEN w/ optional parameter type 2 (Capability) len 23
2024-09-24 04:30:21.292 [DEBG] bgpd: [SCW43-WN4M1] 10.0.0.2 OPEN has Software Version capability (75), length 21
2024-09-24 04:30:21.292 [DEBG] bgpd: [N5H8N-2BZ13] 10.0.0.2 received Software Version: FRRouting/10.1.1_git
2024-09-24 04:30:21.292 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] Receive_OPEN_message (OpenSent->OpenConfirm), fd 28
2024-09-24 04:30:21.292 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: OpenConfirm established_peers 0
2024-09-24 04:30:21.292 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 28 went from OpenSent to OpenConfirm
2024-09-24 04:30:21.293 [DEBG] bgpd: [WNM1E-D314G] 10.0.0.2 rcv OPEN, version 4, remote-as (in open) 65002, holdtime 9, id 10.0.0.2
2024-09-24 04:30:21.293 [INFO] bgpd: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 10.0.0.2 6/7 (Cease/Connection Collision Resolution) 0 bytes
2024-09-24 04:30:21.293 [ERR!] bgpd: [MVZKX-EG443][EC 33554452] bgp_process_packet: BGP OPEN receipt failed for peer: 10.0.0.2
2024-09-24 04:30:21.293 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] BGP_Stop (OpenSent->Idle), fd 26
2024-09-24 04:30:21.294 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Idle established_peers 0
2024-09-24 04:30:21.294 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd -1 went from OpenSent to Idle
2024-09-24 04:30:21.294 [DEBG] bgpd: [ZWCSR-M7FG9] 10.0.0.2 [FSM] Receive_KEEPALIVE_message (OpenConfirm->Established), fd 28
2024-09-24 04:30:21.294 [DEBG] bgpd: [X9KQ0-V0CBB] 10.0.0.2: peer transfer 0x7b28e8ba22b0 fd 28 -> 0x7b28e8b9dac0 fd -1)
2024-09-24 04:30:21.296 [DEBG] bgpd: [WNKP5-SN018] Found existing bnc 10.0.0.2/32(0)(VRF default) flags 0xb ifindex 0 #paths 0 peer 0x7b28e8b9dac0
2024-09-24 04:30:21.296 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Established established_peers 1
2024-09-24 04:30:21.296 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd 28 went from OpenConfirm to Established
2024-09-24 04:30:21.296 [INFO] bgpd: [N9HHH-F8H1M] %ADJCHANGE: neighbor 10.0.0.2(r2) in vrf default Up
2024-09-24 04:30:21.296 [DEBG] bgpd: [V4R0W-D4WGF] 10.0.0.2(Unknown) Update Group Hash: sort: 1 sub_sort: 0 UpdGrpFlags: 0 UpdGrpAFFlags: 553648135
2024-09-24 04:30:21.296 [DEBG] bgpd: [NVVBY-K8MCE] 10.0.0.2(Unknown) Update Group Hash: addpath: 4 UpdGrpCapFlag: 256 UpdGrpCapAFFlag: 2048 route_adv: 0 change local as: 0, as_path_loop_detection: 0
2024-09-24 04:30:21.296 [DEBG] bgpd: [X4CQ0-63QKB] 10.0.0.2(Unknown) Update Group Hash: addpath paths-limit: (send 0, receive 0)
2024-09-24 04:30:21.296 [DEBG] bgpd: [Z8Q37-65KK3] 10.0.0.2(Unknown) Update Group Hash: max packet size: 65535 pmax_out: 0 Peer Group: (NONE) rmap out: (NONE)
2024-09-24 04:30:21.296 [DEBG] bgpd: [SM2F3-HRYKP] 10.0.0.2(Unknown) Update Group Hash: dlist out: (NONE) plist out: (NONE) aslist out: (NONE) usmap out: (NONE) advmap: (NONE) 0
2024-09-24 04:30:21.296 [DEBG] bgpd: [V8B3M-T6VFC] 10.0.0.2(Unknown) Update Group Hash: default rmap: (NONE) shared network and afi active network: 0
2024-09-24 04:30:21.296 [DEBG] bgpd: [Y5EX3-GHT5V] 10.0.0.2(Unknown) Update Group Hash: Lonesoul: 0 ORF prefix: 0 max prefix out: 0
2024-09-24 04:30:21.296 [DEBG] bgpd: [X19K7-9V4K2] 10.0.0.2(Unknown) Update Group Hash: local role: 255 AIGP: 0 SOO: (NONE)
2024-09-24 04:30:21.296 [DEBG] bgpd: [SQ314-QBJCR] 10.0.0.2(Unknown) Update Group Hash key: 241883886
2024-09-24 04:30:21.296 [DEBG] bgpd: [GBN7X-KK0W4] [Event] Deleting stub connection for peer 10.0.0.2
2024-09-24 04:30:21.297 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Deleted established_peers 1
2024-09-24 04:30:21.297 [DEBG] bgpd: [HKWM3-ZC5QP] 10.0.0.2 fd -1 went from Idle to Deleted
2024-09-24 04:30:21.298 [DEBG] bgpd: [P3D3N-3277A] 10.0.0.2 [FSM] Timer (routeadv timer expire)
2024-09-24 04:30:22.394 [DEBG] bgpd: [N1KDM-HR02D] bgp_find_or_add_nexthop(10.0.0.2/32): prefix loops through itself
2024-09-24 04:30:22.394 [DEBG] bgpd: [VKMV1-4Y773] bgp_update(10.0.0.2): NH unresolved
2024-09-24 04:30:22.394 [DEBG] bgpd: [WNKP5-SN018] Found existing bnc 10.0.0.2/32(0)(VRF default) flags 0xb ifindex 0 #paths 0 peer 0x7b28e8b9dac0
2024-09-24 04:30:22.394 [DEBG] bgpd: [WT375-N4KPV] EOR REQ 0, EOR RCV 0
2024-09-24 04:30:22.394 [INFO] bgpd: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.0.0.2 in vrf default
2024-09-24 04:30:22.396 [DEBG] bgpd: [P3D3N-3277A] 10.0.0.2 [FSM] Timer (routeadv timer expire)
2024-09-24 04:30:22.396 [DEBG] bgpd: [ZP3RE-J4Q8C] send End-of-RIB for IPv4 Unicast to 10.0.0.2
r1#

snuffy22 commented 3 weeks ago

After a little sleuthing through the debug output, this looks like it might be related to https://github.com/FRRouting/frr/pull/8956, since that change was backported to 8.0 and the last known working version is 7.5.1.
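
The debug line `bgp_find_or_add_nexthop(10.0.0.2/32): prefix loops through itself` suggests the path is rejected because its next hop is the very host address the prefix covers. Below is a minimal Python sketch of that kind of check, assuming it boils down to comparing the next hop (as a host prefix) with the advertised prefix. The function name is hypothetical and this is not the FRR implementation.

```python
import ipaddress

def loops_through_itself(prefix: str, nexthop: str) -> bool:
    """Return True when the next hop, taken as a host route, equals the prefix."""
    net = ipaddress.ip_network(prefix)
    nh = ipaddress.ip_address(nexthop)
    if net.version != nh.version:
        return False
    # Turn the next hop into a host prefix (/32 for IPv4, /128 for IPv6).
    host = ipaddress.ip_network(f"{nh}/{nh.max_prefixlen}")
    return net == host

# Both prefixes are received by r1 from r2 with next hop 10.0.0.2 (next-hop-self).
for prefix in ("10.0.0.2/32", "10.0.0.5/32"):
    print(prefix, loops_through_itself(prefix, "10.0.0.2"))
```

Under that assumption only 10.0.0.2/32 is flagged, which matches the difference between the two `sh ip bgp` outputs.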

ton31337 commented 3 weeks ago

Can I have an explanation in what case this is needed (route to self)?

r1# sh ip bgp 10.0.0.2
BGP routing table entry for 10.0.0.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.2(r2) (inaccessible, import-check enabled) from r2(10.0.0.2) (10.0.0.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Thu Sep 19 12:27:22 2024
r1#

snuffy22 commented 3 weeks ago

Many other routing operating systems, such as Cisco NX-OS and IOS, Arista cEOS, and Nokia SR, allow such configurations. As noted before, FRR 7.5 and earlier allowed it as well.

A next hop that is the prefix itself should be allowed, because a BGP session to that device is already established (another routing protocol or a static route provides reachability to the address/peer in question).

I do not claim to fully understand the code, but it feels like another factor should be taken into account: if the address is an iBGP peer, we should allow this self-referential treatment (where to reach 10.0.0.2 we go via 10.0.0.2).

We also see that the other route sent from the same r2 device works fine and is added as a valid best route:

r1# sh ip bgp 10.0.0.5
BGP routing table entry for 10.0.0.5/32, version 2
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Local
    10.0.0.2(r2) from r2(10.0.0.2) (10.0.0.2)
      Origin IGP, metric 0, localpref 100, valid, internal, bestpath-from-AS Local, best (First path received)
      Last update: Thu Sep 19 12:28:47 2024
r1#

Currently, the only way I can get the 10.0.0.2 route to be considered valid on FRR 8.0 and above is by redistributing the static route for 10.0.0.2.

ipspace commented 3 weeks ago

> Can I have an explanation in what case this is needed (route to self)?

Please note that 10.0.0.2 (IBGP peer) is usually accessible via some other protocol (IGP), so it's not "route to self" when you look at the bigger picture.

Having the same route accessible via IGP makes this a bit less painful (within an AS), but the true problem is that the loopbacks are then not advertised via EBGP, which breaks anything that uses a combination of IBGP + EBGP with EBGP not changing the next hops, so the loopbacks remain the next hops.

Multi-pod VXLAN with intra-pod IBGP and inter-pod EBGP comes to mind (I would call it multi-site, but then some people use "multi-site" to imply VXLAN-to-VXLAN gateway at the site edge), as does Inter-AS MPLS/VPN Option C.

Hope this helps, Ivan

ton31337 commented 3 weeks ago

We might relax this by checking if this route comes from IGP, not from eBGP, because now it's only relaxed if a static route exists.

ipspace commented 3 weeks ago

> ... if this route comes from IGP, not from eBGP, because now it's only relaxed if a static route exists.

s/eBGP/iBGP/g

Doing this check in EBGP makes way more sense as we usually don't run IGP together with EBGP (unless you're doing the ancient EBGP multihop as load balancing trick).

In any case, it would be great to have a generic "do we have a better route in the routing table or is this another attempt by mr. Munchausen to pull himself out of the quagmire" check.

ton31337 commented 3 weeks ago

@ipspace so if we receive a prefix e.g. 10.0.0.1/32 via 10.0.0.1, and this is via iBGP, then we should keep it as valid, right?

ipspace commented 3 weeks ago

> @ipspace so if we receive a prefix e.g. 10.0.0.1/32 via 10.0.0.1, and this is via iBGP, then we should keep it as valid, right?

That's what everyone else is doing, and it's working fine in most scenarios as the IBGP route usually has lower preference (higher admin distance) than anything else, so if you already have a session with the peer, the route he's announcing won't kill your routing.
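
To illustrate the admin-distance point, here is a small Python sketch of how a RIB would choose between a static route and an IBGP route for the same prefix. The distance values are the commonly documented defaults (assumed here, check your platform), and `rib_winner` is a hypothetical helper, not an FRR function.

```python
# Commonly documented default administrative distances (assumption);
# the lowest value wins when several protocols offer the same prefix.
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20, "ospf": 110, "ibgp": 200}

def rib_winner(candidates):
    """Pick the protocol whose route gets installed for one prefix."""
    return min(candidates, key=lambda proto: ADMIN_DISTANCE[proto])

# 10.0.0.2/32 is known on r1 via a static route and via IBGP from r2.
print(rib_winner(["static", "ibgp"]))  # static
```

With these values the static (or IGP) route keeps forwarding traffic to the peer, so accepting the IBGP copy of the same prefix does not change how the session itself is reached.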

The only exception is the crazy scenario where you run IBGP between loopbacks advertised by EBGP (for EVPN vendors that cannot implement EVPN-over-EBGP properly). In that case, the IBGP route is better than the EBGP route due to shorter AS-path, becomes the best path, and of course you end in the recursive routing territory.
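
A rough Python sketch of why the IBGP path wins in that scenario, assuming a heavily simplified best-path comparison that only looks at local preference and AS-path length. The `Path` class, `best_path` function and AS number 65001 are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Path:
    source: str                                  # "ibgp" or "ebgp"
    as_path: list = field(default_factory=list)  # AS numbers traversed
    local_pref: int = 100

def best_path(paths):
    """Simplified selection: highest local-pref first, then shortest AS path."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))

# The peer's loopback is learned via EBGP (one AS in the path) and also
# announced by the peer itself over IBGP (empty AS path inside the AS).
ebgp = Path("ebgp", as_path=[65001])
ibgp = Path("ibgp", as_path=[])
print(best_path([ebgp, ibgp]).source)  # ibgp
```

Once the IBGP path is best, the route to the peer's loopback points at the peer's loopback, which is the recursive case described above.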

Not sure it's worth checking for that scenario. If someone wraps so much rope around their neck and then starts jumping at the edge of the cliff, maybe they have to learn their lesson the hard way ;)

ton31337 commented 2 weeks ago

Any thoughts if we could control this by no bgp network import-check? If import-check is turned off, this "prefix self" behavior could be ignored.
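
As a sketch of the idea only (Python, hypothetical names, not existing FRR behavior), the self-referential check would be applied only while import-check is enabled. The sketch models just that one rule and assumes the next hop is otherwise resolvable.

```python
import ipaddress

def accept_path(prefix: str, nexthop: str, import_check: bool = True) -> bool:
    """Hypothetical gate: drop a self-referential path only if import-check is on."""
    net = ipaddress.ip_network(prefix)
    nh = ipaddress.ip_address(nexthop)
    # Self-referential: a host route whose only address is the next hop itself.
    self_referential = (
        net.version == nh.version
        and net.prefixlen == net.max_prefixlen
        and net.network_address == nh
    )
    return not (self_referential and import_check)

print(accept_path("10.0.0.2/32", "10.0.0.2"))                      # False
print(accept_path("10.0.0.2/32", "10.0.0.2", import_check=False))  # True
print(accept_path("10.0.0.5/32", "10.0.0.2"))                      # True
```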

ipspace commented 2 weeks ago

> Any thoughts if we could control this by no bgp network import-check? If import-check is turned off, this "prefix self" behavior could be ignored.

This is an edge case, and having a nerd knob to enable it is perfectly fine. I can't say whether it would make more sense to reuse an existing nerd knob, as I haven't figured out what 'network import-check' does (yet).