FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

OSPF routes won't install to RIB if nexthop is via a static route overlapping a connected route #12961

Open DrDeke opened 1 year ago

DrDeke commented 1 year ago

Describe the bug

When a route learned through OSPF has a next-hop IP address that is reachable through a static route that overlaps a connected route, frr marks the route as 'inactive' and does not include it in the RIB.

(The reason I have static routes overlapping a connected route is that I want to advertise only a subset of the IP network reachable on the connected route to some BGP peers.)
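For reference, the five static prefixes used in the configuration in this report appear to be the minimal CIDR cover of 1.2.3.194 through 1.2.3.255, that is, the connected network minus its first two addresses. A minimal sketch that checks this with Python's standard `ipaddress` module follows. The connected prefix 1.2.3.192/26 is an assumption inferred from the OSPF `network` statement and the `show ip route` output; it is not stated explicitly in the report.

```python
import ipaddress

# Assumed connected network on eth0.2 (inferred from the report, not stated outright).
connected = ipaddress.ip_network("1.2.3.192/26")

# Static routes copied from the configuration in this report.
statics = [ipaddress.ip_network(p) for p in (
    "1.2.3.194/31", "1.2.3.196/30", "1.2.3.200/29",
    "1.2.3.208/28", "1.2.3.224/27",
)]

# Minimal set of CIDR blocks covering .194 through .255.
cover = list(ipaddress.summarize_address_range(
    ipaddress.ip_address("1.2.3.194"),
    ipaddress.ip_address("1.2.3.255"),
))

# Do the statics match that cover, and does each sit inside the connected /26?
print(sorted(statics) == cover)
print(all(s.subnet_of(connected) for s in statics))
```

If both checks hold, every address in the /26 except .192 and .193 has a more-specific static route on top of the connected route.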

To Reproduce

  1. Start zebra/bgpd/ospfd/staticd with the following configuration:
!
frr version 8.4.2
frr defaults traditional
hostname kommissar
log syslog informational
service integrated-vtysh-config
!
ip route 1.2.3.224/27 eth0.2
ip route 1.2.3.208/28 eth0.2
ip route 1.2.3.200/29 eth0.2
ip route 1.2.3.196/30 eth0.2
ip route 1.2.3.194/31 eth0.2
!
interface wg2
 ip ospf passive
exit
!
interface wg3
 ip ospf passive
exit
!
router bgp 64513
 neighbor neighbor1 peer-group
 neighbor neighbor1 remote-as 64563
 neighbor neighbor2 peer-group
 neighbor neighbor2 remote-as 65433
 neighbor neighbor1_ip_addr peer-group neighbor1
 neighbor neighbor1_ip_addr update-source 172.23.255.45
 neighbor neighbor2_ip_addr peer-group neighbor2
 neighbor neighbor2_ip_addr update-source 172.31.254.253
 !
 address-family ipv4 unicast
  network 10.1.0.0/24
  network 1.2.3.194/31
  network 1.2.3.196/30
  network 1.2.3.200/29
  network 1.2.3.208/28
  network 1.2.3.224/27
  redistribute connected
  redistribute ospf
  neighbor neighbor1_ip_addr prefix-list bgp-mesh-infilter in
  neighbor neighbor1_ip_addr prefix-list bgp-mesh-outfilter out
  neighbor neighbor2_ip_addr prefix-list bgp-mesh-infilter in
  neighbor neighbor2_ip_addr prefix-list bgp-mesh-outfilter out
 exit-address-family
exit
!
router ospf
 ospf router-id 1.2.3.193
 redistribute bgp
 network 10.1.0.0/24 area 0.0.0.0
 network 10.1.1.8/30 area 0.0.0.0
 network 10.1.1.12/30 area 0.0.0.0
 network 10.1.2.0/24 area 0.0.0.0
 network 1.2.3.192/26 area 0.0.0.0
 network neighbor1_ip_addr/31 area 0.0.0.0
 network neighbor2_ip_addr/31 area 0.0.0.0
exit
!
ip prefix-list bgp-mesh-infilter seq 10 permit 0.0.0.0/0 ge 16
ip prefix-list bgp-mesh-outfilter seq 10 permit 10.1.0.0/24
ip prefix-list bgp-mesh-outfilter seq 20 permit 10.1.6.0/24
ip prefix-list bgp-mesh-outfilter seq 30 permit 10.1.1.12/30
ip prefix-list bgp-mesh-outfilter seq 1000 permit 1.2.3.194/31
ip prefix-list bgp-mesh-outfilter seq 1001 permit 1.2.3.196/30
ip prefix-list bgp-mesh-outfilter seq 1002 permit 1.2.3.200/29
ip prefix-list bgp-mesh-outfilter seq 1003 permit 1.2.3.208/28
ip prefix-list bgp-mesh-outfilter seq 1004 permit 1.2.3.224/27
!
end
  2. Have a router at 1.2.3.203 running OSPF and advertising a route to 10.1.8.1/32 with nexthop=1.2.3.203
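To make the overlap concrete, here is a small longest-prefix-match sketch for the next hop 1.2.3.203 against the connected and static prefixes from the configuration above. The helper name `longest_prefix_match` and the `routes` list are illustrative only, and the connected prefix 1.2.3.192/26 is assumed from the OSPF `network` statement.

```python
import ipaddress

def longest_prefix_match(addr, routes):
    """Return the (network, origin) pair with the longest prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    candidates = [(net, origin) for net, origin in routes if ip in net]
    return max(candidates, key=lambda r: r[0].prefixlen, default=None)

routes = [
    (ipaddress.ip_network("1.2.3.192/26"), "connected"),  # assumed connected prefix
    (ipaddress.ip_network("1.2.3.224/27"), "static"),
    (ipaddress.ip_network("1.2.3.208/28"), "static"),
    (ipaddress.ip_network("1.2.3.200/29"), "static"),
    (ipaddress.ip_network("1.2.3.196/30"), "static"),
    (ipaddress.ip_network("1.2.3.194/31"), "static"),
]

match = longest_prefix_match("1.2.3.203", routes)
print(match)
```

With these inputs the best match for 1.2.3.203 is the static 1.2.3.200/29 rather than the connected /26, which is the situation this report describes.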

Expected behavior

I would expect to see a route to 10.1.8.1/32 via 1.2.3.203 in the output of show ip route and added to the host's routing table (RIB). Instead, the route to 10.1.8.1/32 does not appear in the host's routing table, and appears as inactive in the output of show ip route or show ip route ospf as follows:

O 10.1.8.1/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:15:20

With debug zebra rib detail enabled in FRR, bringing up the OSPF neighbor produces the following output, which seems relevant to the 10.1.8.1/32 route:

2023-03-06 17:04:00.366 [DEBG] zebra: [MFYWV-KH3MC] process_subq_early_route_add: (0:?):10.1.8.1/32: Inserting route rn 0x558742686760, re 0x5587426836c0 (ospf) existing 0x0, same_count 0
2023-03-06 17:04:00.366 [DEBG] zebra: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x5587426836c0 for 10.1.8.1/32 vrf default(0)
2023-03-06 17:04:00.366 [DEBG] zebra: [M5M58-9PD2R] 10.1.8.1/32: uptime == 686247, type == 6, instance == 0, table == 254
2023-03-06 17:04:00.366 [DEBG] zebra: [RVZMM-N7DME] 10.1.8.1/32: metric == 4, mtu == 0, distance == 110, flags == None status == None
2023-03-06 17:04:00.366 [DEBG] zebra: [Q1NW5-NWY7P] 10.1.8.1/32: nexthop_num == 1, nexthop_active_num == 0
2023-03-06 17:04:00.366 [DEBG] zebra: [ZSB1Z-XM2V3] 10.1.8.1/32: NH 1.2.3.203[6] vrf default(0) wgt 1, with flags
2023-03-06 17:04:00.366 [DEBG] zebra: [SCETK-GQ9E4] 10.1.8.1/32: dump complete
2023-03-06 17:04:00.366 [DEBG] zebra: [GCGMT-SQR82] rib_link: (0:?):10.1.8.1/32: rn 0x558742686760 adding dest
2023-03-06 17:04:00.366 [DEBG] zebra: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):10.1.8.1/32: queued rn 0x558742686760 into sub-queue RIP/OSPF/ISIS/EIGRP/NHRP Routes

2023-03-06 17:04:00.367 [DEBG] zebra: [NZNZ4-7P54Y] default(0:254):10.1.8.1/32: Processing rn 0x558742686760
2023-03-06 17:04:00.367 [DEBG] zebra: [ZJVZ4-XEGPF] default(0:254):10.1.8.1/32: Examine re 0x5587426836c0 (ospf) status: Changed flags: None dist 110 metric 4
2023-03-06 17:04:00.367 [DEBG] zebra: [M7EN1-55BTH]         nexthop_active: Route Type ospf has not turned on recursion
2023-03-06 17:04:00.367 [DEBG] zebra: [HJ48M-MB610]         nexthop_active_check: Unable to find active nexthop
2023-03-06 17:04:00.367 [DEBG] zebra: [JPJF4-TGCY5] default(0:0):10.1.8.1/32: After processing: old_selected 0x0 new_selected 0x0 old_fib 0x0 new_fib 0x0
2023-03-06 17:04:00.367 [DEBG] zebra: [HH6N2-PDCJS] default(0:254):10.1.8.1/32 rn 0x558742686760 dequeued from sub-queue RIP/OSPF/ISIS/EIGRP/NHRP Routes
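One possible reading of the "has not turned on recursion" message is that the next hop is accepted only when its best-matching route is connected, unless the route type allows recursive resolution. The toy model below encodes that hypothesis. It is an assumption inferred from the log text, not code taken from FRR, and the function name `nexthop_usable` and the `rib` tuples are made up for illustration.

```python
import ipaddress

def nexthop_usable(nexthop, rib, allow_recursion=False):
    """Toy check: usable if the best-matching route is connected, or recursion is allowed."""
    ip = ipaddress.ip_address(nexthop)
    matches = [(ipaddress.ip_network(prefix), origin) for prefix, origin in rib]
    matches = [(net, origin) for net, origin in matches if ip in net]
    if not matches:
        return False  # nothing covers the next hop at all
    # Pick the most specific covering route, as a longest-prefix match would.
    _, origin = max(matches, key=lambda m: m[0].prefixlen)
    return origin == "connected" or allow_recursion

# Assumed connected /26 plus the overlapping static /29 from the report.
rib_with_statics = [("1.2.3.192/26", "connected"), ("1.2.3.200/29", "static")]
rib_connected_only = [("1.2.3.192/26", "connected")]

print(nexthop_usable("1.2.3.203", rib_with_statics))
print(nexthop_usable("1.2.3.203", rib_connected_only))
```

Under this model the next hop is rejected only while the more-specific static route is present. That would match the symptoms, but it remains a guess until someone familiar with zebra's next-hop resolution confirms it.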

Screenshots

kommissar# show ip route ospf
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

O   10.1.0.0/24 [110/3] is directly connected, eth0.2, weight 1, 00:25:30
O   10.1.1.8/30 [110/36] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   10.1.1.12/30 [110/10] is directly connected, wg0, weight 1, 00:25:29
O   10.1.1.28/30 [110/1003] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   10.1.1.40/30 [110/1003] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   10.1.2.0/24 [110/3] is directly connected, eth0.5, weight 1, 00:25:29
O   10.1.4.2/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O>* 10.1.6.0/24 [110/11] via 10.1.1.14, wg0, weight 1, 00:25:16
O   10.1.8.1/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   10.7.0.1/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   10.7.1.1/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O   1.2.3.192/26 [110/3] is directly connected, eth0.2, weight 1, 00:25:29
O   172.23.255.44/31 [110/10] is directly connected, wg2, weight 1, 00:25:29
O   172.31.254.252/31 [110/10] is directly connected, wg3, weight 1, 00:25:29
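To see at a glance which routes are affected, the output above can be grouped by next hop. The sketch below parses a few lines copied from that output. The regular expression is written against this specific output format and is an assumption; it is not an official grammar for `show ip route`.

```python
import re
from collections import defaultdict

# A few lines copied from the output above.
SAMPLE = """\
O   10.1.0.0/24 [110/3] is directly connected, eth0.2, weight 1, 00:25:30
O   10.1.1.8/30 [110/36] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
O>* 10.1.6.0/24 [110/11] via 10.1.1.14, wg0, weight 1, 00:25:16
O   10.1.8.1/32 [110/4] via 1.2.3.203, eth0.2 inactive, weight 1, 00:25:28
"""

# Matches OSPF lines of the form "<prefix> [dist/metric] via <nexthop>, <ifname> inactive".
INACTIVE = re.compile(
    r"^O\S*\s+(\S+)\s+\[\d+/\d+\]\s+via\s+([\d.]+),\s+(\S+)\s+inactive"
)

def inactive_by_nexthop(text):
    """Group inactive OSPF prefixes by their (nexthop, interface) pair."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = INACTIVE.match(line)
        if m:
            prefix, nexthop, ifname = m.groups()
            groups[(nexthop, ifname)].append(prefix)
    return dict(groups)

result = inactive_by_nexthop(SAMPLE)
print(result)
```

Run against the full output, every inactive entry falls under the single next hop 1.2.3.203 on eth0.2, while the route via 10.1.1.14 on wg0 is selected and installed.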

Versions

Additional context

I am not sure whether this is a bug or whether I am misunderstanding how this should work. Any insight will be appreciated, and please let me know if any additional debugging output would be helpful.

Note: The IP network 1.2.3.0/24 is not the actual IPv4 network in the configuration; I have replaced the first three octets with '1.2.3' for privacy reasons in all parts of this report.

github-actions[bot] commented 11 months ago

This issue is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this issue closed.

frrbot[bot] commented 11 months ago

This issue will be automatically closed in the specified period unless there is further activity.

DrDeke commented 11 months ago

@eqvinox Any ideas on this?

frrbot[bot] commented 11 months ago

This issue will no longer be automatically closed.