FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/
Other
3.35k stars 1.25k forks source link

FRR 10.0.1 route maps filtering prefixes that should be sent #16367

Open fedepaol opened 4 months ago

fedepaol commented 4 months ago

Description

This is happening in https://github.com/metallb/frr-k8s/pull/166 , where I am bumping the version used in frr-k8s from 9.1 to 10.0.1. The FRR-K8s CI is quite stable, while after the bump I am seeing a few flakes.

172.18.0.2 tries to announce 192.168.0.0/24 and 172.18.0.5 is not receiving it

I am going to attach the full configuration, but adding here the relevant snippets:

Neighbor route map:

address-family ipv4 unicast
  network 192.168.0.0/24
  network 192.168.1.0/24
  network 192.168.2.0/24
  network 192.168.3.0/24
  neighbor 172.18.0.5 activate
  neighbor 172.18.0.5 route-map 172.18.0.5-in in
  neighbor 172.18.0.5 route-map 172.18.0.5-out out

Prefix lists:

ip prefix-list 172.18.0.5-100-ipv4-localpref-prefixes seq 1 permit 192.168.0.0/24
ip prefix-list 172.18.0.5-100-ipv4-localpref-prefixes seq 2 permit 192.168.1.0/24
ip prefix-list 172.18.0.5-pl-ipv4 seq 1 permit 192.168.0.0/24
ip prefix-list 172.18.0.5-pl-ipv4 seq 2 permit 192.168.1.0/24
ip prefix-list 172.18.0.5-pl-ipv4 seq 3 permit 192.168.2.0/24
ip prefix-list 172.18.0.5-pl-ipv4 seq 4 permit 192.168.3.0/24
ip prefix-list 172.18.0.5-150-ipv4-localpref-prefixes seq 1 permit 192.168.2.0/24
ip prefix-list 172.18.0.5-inpl-ipv4 seq 1 deny any
ip prefix-list 172.18.0.5-denied-inpl-ipv4 seq 1 permit 192.167.9.0/24 le 32

Route maps:


route-map 172.18.0.5-out permit 1
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit
!
route-map 172.18.0.5-out permit 2
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit
!
route-map 172.18.0.5-out permit 3
 match ip address prefix-list 172.18.0.5-150-ipv4-localpref-prefixes
 on-match next
 set local-preference 150
exit
!
route-map 172.18.0.5-out permit 4
 match ip address prefix-list 172.18.0.5-pl-ipv4
exit
!
route-map 172.18.0.5-out permit 5
 match ipv6 address prefix-list 172.18.0.5-pl-ipv4
exit
!

Show bgp ipv4 on 172.18.0.2

Instance default:
BGP table version is 4, local router ID is 172.18.0.2, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/24   0.0.0.0                  0         32768 i
 *> 192.168.1.0/24   0.0.0.0                  0         32768 i
 *> 192.168.2.0/24   0.0.0.0                  0         32768 i
 *> 192.168.3.0/24   0.0.0.0                  0         32768 i

Show bgp ipv4 on 172.18.0.5:

Dumping show bgp ip for ebgp-single-hop, local addresses: ipv4 - 172.18.0.5, ipv6 - fc00:f853:ccd:e793::5

Instance default:
BGP table version is 20, local router ID is 172.18.0.5, vrf id 0
Default local pref 100, local AS 4200000000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *= 192.168.0.0/24   172.18.0.4               0             0 64512 i
 *>                  172.18.0.3               0             0 64512 i
 *= 192.168.1.0/24   172.18.0.4               0             0 64512 i
 *>                  172.18.0.3               0             0 64512 i
 *= 192.168.2.0/24   172.18.0.4               0             0 64512 i
 *>                  172.18.0.3               0             0 64512 i
 *= 192.168.3.0/24   172.18.0.4               0             0 64512 i
 *>                  172.18.0.3               0             0 64512 i

(as we can see some other instances were able to push the routes.

The logs say the prefix is being filtered by the route-map:


2024/07/11 11:05:26.147 BGP: [WJC21-WPGZF] u171:s171 announcing routes to 172.18.0.5, combined into 1 peers
2024/07/11 11:05:26.147 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.0.0/24, selected=0x7fb35fffd0b0
2024/07/11 11:05:26.147 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.0.0/24 is filtered by route-map '172.18.0.5-out'
2024/07/11 11:05:26.147 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.1.0/24, selected=0x7fb35fffd1f0
2024/07/11 11:05:26.147 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.1.0/24 is filtered by route-map '172.18.0.5-out'
2024/07/11 11:05:26.147 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.2.0/24, selected=0x7fb35fffd330
2024/07/11 11:05:26.147 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.2.0/24 is filtered by route-map '172.18.0.5-out'
2024/07/11 11:05:26.147 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.3.0/24, selected=0x7fb35fffd470
2024/07/11 11:05:26.147 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.3.0/24 is filtered by route-map '172.18.0.5-out'
2024/07/11 11:05:26.149 BGP: [WJC21-WPGZF] u174:s174 announcing routes to 172.30.0.3, combined into 1 peers

Version

I am experiencing this using the images quay.io/frrouting/frr:10.0.1 and quay.io/frrouting/frr:master

How to reproduce

It's a flake, the same test sometimes passes and some other time it does not. I have the feeling that it is associated to prefixlist with community / local preferences. On the other hand, it happens often enough to be able to provide the right dump commands after a test fails.

Expected behavior

The neighbor should receive the prefixes

Actual behavior

The instance running on the cluster is not advertising the prefixes.

Additional context

No response

Checklist

fedepaol commented 4 months ago

Full running config of 172.18.0.5 (the receiver)

####### Show running config
Building configuration...

Current configuration:
!
frr version 10.0.1_git
frr defaults traditional
hostname cd1e47ac22ac
log file /tmp/frr.log
log timestamp precision 3
!
debug zebra nht
debug bgp neighbor-events
debug bgp nht
debug bgp updates in
debug bgp updates out
debug bfd peer
!
password zebra
!
router bgp 4200000000
 bgp router-id 172.18.0.5
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 neighbor 172.18.0.2 remote-as 64512
 neighbor 172.18.0.3 remote-as 64512
 neighbor 172.18.0.4 remote-as 64512
 !
 address-family ipv4 unicast
  neighbor 172.18.0.2 activate
  neighbor 172.18.0.2 next-hop-self
  neighbor 172.18.0.3 activate
  neighbor 172.18.0.3 next-hop-self
  neighbor 172.18.0.4 activate
  neighbor 172.18.0.4 next-hop-self
 exit-address-family
exit
!
route-map RMAP permit 10
 set ipv6 next-hop prefer-global
exit
!
end
fedepaol commented 4 months ago

Running config for the instance running on the cluster

runningconfig.txt

fedepaol commented 4 months ago

logs of the instance running on the cluster frrlogs.txt

ton31337 commented 3 months ago
route-map 172.18.0.5-out permit 1
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit
!
route-map 172.18.0.5-out permit 2
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit

Two identical match/set with different sequence numbers?

Btw, could you please enable debug route-map detail? It would at least tell what's going on a bit more.

Also, I'm trying to understand the use case here with on-match next. Could you drop it and test without on-match next, because you don't use it at all (as it should be).

fedepaol commented 3 months ago
route-map 172.18.0.5-out permit 1
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit
!
route-map 172.18.0.5-out permit 2
 match ip address prefix-list 172.18.0.5-100-ipv4-localpref-prefixes
 on-match next
 set local-preference 100
exit

Two identical match/set with different sequence numbers?

Ugh, I guess we can remove them

Btw, could you please enable debug route-map detail? It would at least tell what's going on a bit more.

Yup, will do and re-run

Also, I'm trying to understand the use case here with on-match next. Could you drop it and test without on-match next, because you don't use it at all (as it should be).

The idea (maybe I am mis-using it) is to apply the local pref and then validate the ip later on to the following maps. I need it in case I want also to apply a community to the same prefixes

fedepaol commented 3 months ago

@ton31337 here we go:

2024/07/12 15:53:58.328 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.0.0/24, selected=0x7f56a3a81330
2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-100-ipv4-localpref-prefixes specified does not exist defaulting to NO_MATCH
2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-10:100-ipv4-community-prefixes specified does not exist defaulting to NO_MATCH
2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-150-ipv4-localpref-prefixes specified does not exist defaulting to NO_MATCH
2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-10:101-ipv4-community-prefixes specified does not exist defaulting to NO_MATCH
2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-pl-ipv4 specified does not exist defaulting to NO_MATCH
2024/07/12 15:53:58.328 BGP: [KZK58-6T4Y6] No best match sequence for pfx: 192.168.0.0/24 in route-map: 172.30.0.2-out, result: no match

Show running config and full log attached : showrunning.txt.gz

fulllog.txt.gz

fedepaol commented 3 months ago

I checked only the first, seem to be there, so I'd say it's not related to the redundant match (although it makes sense to remove it).

ton31337 commented 3 months ago

Is this happening during frr-reload, on boot, container restart?

fedepaol commented 3 months ago

Definitely during reload. It's a battery of e2e tests, where on each test we run a different configuration and do some assertions.

ton31337 commented 3 months ago

Could it be possible to have a full reload log? frr-reload.py --debug --reload /etc/frr/frr.conf. I'd like to understand what actually triggers this flake...

fedepaol commented 3 months ago

Here's another run:


This is the neighbor complaining about missing routes (172.18.0.5) 

Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *>i192.168.2.0/24   172.18.0.2               0    100      0 i
 *=i                 172.18.0.4               0    100      0 i

Displayed 1 routes and 2 total paths

This is 172.18.0.3:

Instance default:
BGP table version is 2, local router ID is 172.18.0.3, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.2.0/24   0.0.0.0                  0         32768 i
 *> 192.169.2.0/24   0.0.0.0                  0         32768 i

Displayed 2 routes and 2 total paths

ip prefix-list 172.18.0.5-pl-ipv4 seq 1 permit 192.168.2.0/24

Still have the filter

2024/07/23 15:29:13.553 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.2.0/24 is filtered by route-map '172.18.0.5-out'
fedepaol commented 3 months ago

Last reload event: lastreload.txt.gz

fedepaol commented 3 months ago

Sequence of reloads in the last 10 minutes reload.txt.gz

Every reload sequence starts with Caught SIGHUP and acquired lock! Reloading FRR.. and ends with FRR reloaded successfully! 2 seconds

Every cycle, we do a frr-reload.py --test first and a frr-reload.py --overwrite later.

fedepaol commented 3 months ago

frr.zip

fedepaol commented 3 months ago

As discussed on slack, I attached the logs of the reloader and the generated configuration when running with 9.1 (the generated config is the same regardless).

ton31337 commented 3 months ago

First, this is the wrong sequence of operations:

ip prefix-list 172.30.0.2-pl-ipv4 seq 1 permit 192.168.2.0/24
ip prefix-list 172.30.0.2-pl-ipv4 seq 2 permit 192.169.2.0/24
route-map 172.30.0.2-out permit 1
  match ip address prefix-list 172.30.0.2-pl-ipv4
route-map 172.30.0.2-out permit 2
  match ipv6 address prefix-list 172.30.0.2-pl-ipv4

ipv6 prefix-list 172.30.0.2-pl-ipv4 seq 3 deny any <<<<<<<<<<<<<<<<<

ipv6 prefix-list ... should be created before route-map statements to avoid noisy logs like below.

Looking at the logs you provided, it's not clear which AFI triggered this output (I assume IPv6):

2024/07/12 15:53:58.328 BGP: [RYF1Z-ZKDRS] route_match_address_prefix_list: Prefix List 172.30.0.2-10:100-ipv4-community-prefixes specified does not exist defaulting to NO_MATCH

Could you move ipv6 prefix-list ... creation above route-maps and re-test. I'd like to get rid of any false positives.

Also, could you enable log commands to see what actual commands were executed and in which order.

fedepaol commented 1 month ago

Sorry for the extreme delay @ton31337 , here I am with the additional information:

Here we have 172.18.0.5 not getting routes from 172.18.0.2.

From the logs the usual:

2024/09/12 16:01:41.695 BGP: [WJC21-WPGZF] u109:s109 announcing routes to 172.18.0.5, combined into 1 peers
2024/09/12 16:01:41.695 BGP: [T5JFA-13199] subgroup_process_announce_selected: p=192.168.2.0/24, selected=0x7fb5d00e47d0
2024/09/12 16:01:41.695 BGP: [KZK58-6T4Y6] No best match sequence for pfx: 192.168.2.0/24 in route-map: 172.18.0.5-out, result: no match
2024/09/12 16:01:41.695 BGP: [H5AW4-JFYQC] Route-map: 172.18.0.5-out, prefix: 192.168.2.0/24, result: deny
2024/09/12 16:01:41.695 BGP: [VERY4-P6JC8] 172.18.0.5(Unknown) [Update:SEND] 192.168.2.0/24 is filtered by route-map '172.18.0.5-out'

I applied the requested changes (define the prefix lists before the maps and adding log commands in the configuration):

Attaching a tgz with:

Also, note that in a few reloads I am seeing

BGP is already running; AS is 64512
line 684: Failure to communicate[13] to bgpd, line: router bgp 64513 vrf red

line 685: Warning[4]...: early exit from config file
[2101|bgpd] Configuration file[/etc/frr/frr.conf] processing failure: 13
[2094|mgmtd] done
2024-09-12 15:57:51,557 WARNING: frr-reload.py failed due to
vtysh (exec file) exited with status 13

Not sure it's related or not. I am going to attach also a file with more reload events (reload_full.txt) which contains them too.

logs.zip