Open Per-Forma opened 1 year ago
I wonder if this is related to disabling unicast ... can you try without that step?
Apologies for not getting back to you sooner on this. I had to take down my lab machines and move them so I got disrupted a bit.
I just removed the unicast disable command and then restarted the frr service.
No change in behavior so far. Prefixes show up after entering neighbor 192.168.62.1 activate, but they are not installed into the inet routing table. I don't know if it matters to your thinking here, but although the unicast SAFI is enabled, it's not established in BGP, as I don't have it turned on in the 5171 neighbor.
this looks like a bug ... clearing myself off in case someone wants to work on a fix
Can we see the following output from frr-boc please?
#show bgp ipv4 vpn detail-routes
I have seen something similar to this and it was due to the RTs changing when BGP came up - we solved it by manually applying a unique router-id on the BGP vrf process (router bgp 394211 vrf inet).
Sure thing! Here it is:
frr-boc# sh bgp ipv4 vpn detail-routes
BGP table version is 12, local router ID is 192.168.62.41, vrf id 0
Default local pref 100, local AS 394211
Route Distinguisher: 394211:42
BGP routing table entry for 394211:42:147.0.0.0/30, version 11
not allocated
Paths: (1 available, best #1)
Advertised to non peer-group peers:
192.168.62.1
11599
147.0.0.1 from 0.0.0.0 (192.168.62.41) vrf inet(5) announce-nh-self
Origin incomplete, metric 0, valid, sourced, local, best (First path received)
Extended Community: RT:394211:42
Originator: 192.168.62.41
Remote label: 80
Last update: Tue Nov 28 01:09:43 2023
BGP routing table entry for 394211:42:150.0.0.1/32, version 12
not allocated
Paths: (1 available, best #1)
Advertised to non peer-group peers:
192.168.62.1
11599
147.0.0.1 from 0.0.0.0 (192.168.62.41) vrf inet(5) announce-nh-self
Origin incomplete, metric 0, valid, sourced, local, best (First path received)
Extended Community: RT:394211:42
Originator: 192.168.62.41
Remote label: 80
Last update: Tue Nov 28 01:09:43 2023
BGP routing table entry for 394211:42:158.0.0.0/24, version 1
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
Local
192.168.62.1 (metric 110) from 192.168.62.1 (192.168.62.1)
Origin incomplete, localpref 100, valid, internal, best (First path received)
Extended Community: RT:394211:42
Remote label: 68001
Last update: Mon Nov 27 23:38:44 2023
BGP routing table entry for 394211:42:171.0.0.0/24, version 2
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
Local
192.168.62.2 (metric 120) from 192.168.62.1 (192.168.62.2)
Origin incomplete, localpref 100, valid, internal, best (First path received)
Extended Community: RT:394211:42
Originator: 192.168.62.2, Cluster list: 192.168.62.1
Remote label: 68008
Last update: Mon Nov 27 23:38:44 2023
Displayed 4 routes and 4 total paths
For some reason, I'm not seeing notifications from GitHub, or I would have responded sooner. I'll see if I can do something about that so I can respond a little quicker.
@Per-Forma thanks. From frr-boc, can we also see the output of the following:
#show run bgp
#show ip fib vrf inet
Here they are
frr-boc# sh run bgp
Building configuration...
Current configuration:
!
frr version 9.1-dev
frr defaults traditional
hostname frr-boc
log file /var/log/frr/bgpd.log
log syslog informational
service integrated-vtysh-config
!
router bgp 394211 vrf inet
neighbor 147.0.0.1 remote-as 11599
!
address-family ipv4 unicast
neighbor 147.0.0.1 soft-reconfiguration inbound
neighbor 147.0.0.1 route-map bgp-allow-all-map in
neighbor 147.0.0.1 route-map bgp-allow-all-map out
label vpn export auto
rd vpn export 394211:42
rt vpn both 394211:42
export vpn
import vpn
exit-address-family
exit
!
router bgp 394211
bgp router-id 192.168.62.41
neighbor 192.168.62.1 remote-as 394211
neighbor 192.168.62.1 update-source lo
!
address-family ipv4 vpn
neighbor 192.168.62.1 activate
neighbor 192.168.62.1 soft-reconfiguration inbound
exit-address-family
exit
!
ip prefix-list all-v4 seq 5 permit any
!
route-map bgp-allow-all-map permit 5
match ip address prefix-list all-v4
exit
!
end
frr-boc# sh ip fib vrf inet
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF inet:
C>* 147.0.0.0/30 is directly connected, enp2s0, 16:46:39
B>* 150.0.0.1/32 [20/0] via 147.0.0.1, enp2s0, weight 1, 15:14:39
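The two outputs above can be compared mechanically to list which VPN prefixes never reach the VRF FIB. This is a minimal sketch: the input strings are copied from the outputs in this thread, and the helper names (`vpn_prefixes`, `fib_prefixes`) are hypothetical, not part of FRR.

```python
import re

# Entry lines copied from the `show bgp ipv4 vpn detail-routes` output above.
VPN_OUTPUT = """\
BGP routing table entry for 394211:42:147.0.0.0/30, version 11
BGP routing table entry for 394211:42:150.0.0.1/32, version 12
BGP routing table entry for 394211:42:158.0.0.0/24, version 1
BGP routing table entry for 394211:42:171.0.0.0/24, version 2
"""

# Route lines copied from the `show ip fib vrf inet` output above.
FIB_OUTPUT = """\
C>* 147.0.0.0/30 is directly connected, enp2s0, 16:46:39
B>* 150.0.0.1/32 [20/0] via 147.0.0.1, enp2s0, weight 1, 15:14:39
"""

PREFIX = r"(\d+\.\d+\.\d+\.\d+/\d+)"


def vpn_prefixes(text):
    # The IPv4 prefix follows the "ASN:NN:" route distinguisher.
    return set(re.findall(r"entry for \d+:\d+:" + PREFIX, text))


def fib_prefixes(text):
    # A FIB route line starts with a code column such as "C>*" or "B>*".
    return set(re.findall(r"^\S+\s+" + PREFIX, text, flags=re.MULTILINE))


# Prefixes present in the VPN table but absent from the VRF FIB.
missing = sorted(vpn_prefixes(VPN_OUTPUT) - fib_prefixes(FIB_OUTPUT))
print(missing)
```

With the outputs shown here, the difference is the two prefixes learned from 192.168.62.1 (158.0.0.0/24 and 171.0.0.0/24), which matches the behavior described in the report.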
@Per-Forma thanks for the output. I can't see anything obvious in it. The only thing I can think to try is manually setting the RID on the VRF BGP instance itself using the following (write and restart FRR afterwards):
router bgp 394211 vrf inet
bgp router-id x.x.x.x
Hey @beith12, thanks for looking at this with me. I set the RID in the bgp vrf stanza as you indicated. It's now set to bgp router-id 147.0.0.2, but that doesn't appear to have improved anything yet.
@Per-Forma OK - are any of the working devices with a VRF in this topology running 9.1-dev? If so, it might be worth comparing everything side by side. I see that you mentioned you are trying to get IPv6 working in an MPLS VPN. I assume (based on your configs) this is over an IPv4 underlay? If so, I have been testing earlier releases for this feature and have yet to achieve end-to-end connectivity.
@Per-Forma Should also mention that 9.1 was released today so might be worth trying that rather than dev version.
@beith12 - for reference, this environment is a lab topology. I initially configured it to test 6VPE (IPv6 VPN over an IPv4 MPLS backbone). We have a version of this in our production environment, and the IPv4 VRF routes are working correctly there. Our production environment is running the frr 8.1 package that is packaged by Canonical. In that environment, I'm seeing an issue with the v6 routes in the VRF. My original intention here was to build frr from source, replicate the issue I'm seeing in production, and then work on resolving it. However, I've been stuck on this IPv4 route problem, which does not occur in production.
In this lab environment, I am running a 9.1 dev version. I linked the commit in the original issue, which is from June. I haven't changed that, as I know how frustrating it can be when things get changed during a troubleshooting exercise!
I can work on building a new version and testing this on it if you think that would be the best use of time.
@Per-Forma I would rebuild (the faulty node at least) with the 9.1 version that was released yesterday to compare.
Thanks
Hey @beith12, got this done, but I'm seeing much the same results. However, I did notice a couple of odd warnings in the status output for the frr.service. See below. I notice that there are two of these, which matches the number of routes in this test environment that I'm missing. Given that the message references the SRGB, I think this must be related. The part I don't understand is that the SID index numbers being given don't match up with a valid mpls label as they are out of range for a valid label.
Dec 04 19:45:36 frr-boc frrinit.sh[9698]: * Started watchfrr
Dec 04 19:45:36 frr-boc watchfrr[9709]: [KWE5Q-QNGFC] all daemons up, doing startup-complete notify
Dec 04 19:45:36 frr-boc systemd[1]: Started FRRouting.
Dec 04 19:45:37 frr-boc zebra[9722]: [V98V0-MTWPF] client 54 says hello and bids fair to announce only bgp routes vrf=0
Dec 04 19:45:48 frr-boc ospfd[9737]: [S5PCG-77H23] Packet[DD]: Neighbor 192.168.62.1 Negotiation done (Master).
Dec 04 19:45:48 frr-boc ospfd[9737]: [XYQCD-TPKQT][EC 134217736] index2label: SID index 7168000 falls outside SRGB range
Dec 04 19:45:48 frr-boc ospfd[9737]: [G51Y1-54QJR][EC 134217744] Type-10 Opaque-LSA (opaque_type=8): Common origination for AREA(0.0.0.0) has already started
Dec 04 19:45:50 frr-boc ospfd[9737]: [XYQCD-TPKQT][EC 134217736] index2label: SID index 7168256 falls outside SRGB range
Dec 04 19:45:51 frr-boc zebra[9722]: [WPPMZ-G9797] if_zebra_speed_update: enp2s0 old speed: 0 new speed: 1000
Dec 04 19:45:59 frr-boc bgpd[9730]: [JG0WZ-7X009][EC 33554504] 192.168.62.1 unrecognized capability code: 128 - ignored
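To make the "out of range" observation concrete, here is a minimal sketch that pulls the SID indices out of the two ospfd warnings above and compares them with the largest value a 20-bit MPLS label can hold. The right shift at the end is only an arithmetic observation; whether the neighbor is really advertising an absolute label that is being read as an index is an assumption that nothing in this thread confirms.

```python
import re

# MPLS labels are 20 bits wide, so the largest possible label is 2**20 - 1.
MAX_MPLS_LABEL = 2**20 - 1

# The two ospfd warnings copied from the log above.
LOG = """\
Dec 04 19:45:48 frr-boc ospfd[9737]: [XYQCD-TPKQT][EC 134217736] index2label: SID index 7168000 falls outside SRGB range
Dec 04 19:45:50 frr-boc ospfd[9737]: [XYQCD-TPKQT][EC 134217736] index2label: SID index 7168256 falls outside SRGB range
"""

indices = [
    int(value)
    for value in re.findall(r"SID index (\d+) falls outside SRGB range", LOG)
]

for index in indices:
    too_big = index > MAX_MPLS_LABEL
    # Dropping the low 8 bits is an unconfirmed guess at how the value was
    # encoded; it is shown only because the results look like plausible labels.
    print(index, "exceeds 20-bit label space:", too_big, "| index >> 8 =", index >> 8)
```

Both logged indices exceed the 20-bit label space on their own, before any SRGB base is added, so no SRGB setting on frr-boc could turn them into valid labels as they stand.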
@Per-Forma This may relate to the next hop of the BGP route (announced by OSPF). Have you set segment-routing global-block xxxxxxxx anywhere? If so, it is best for this label range to match on all devices. The SID index (defined by segment-routing prefix x.x.x.x/32 index) is added to the SRGB base to create the transport label, so double-check that the index is not a large number and that net.mpls.platform_labels is set high enough in Linux (use sudo sysctl -a --pattern mpls to see the current values).
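The arithmetic described above can be sketched as follows. The SRGB bounds, the index 41, and the platform_labels values are hypothetical example numbers, not taken from this lab; the function names are illustrative and are not FRR internals.

```python
def index_to_label(sid_index, srgb_lower, srgb_upper):
    """Return the transport label for a SID index, or None if it
    falls outside the SRGB (the case the ospfd warning reports)."""
    label = srgb_lower + sid_index
    if label > srgb_upper:
        return None
    return label


def fits_platform(label, platform_labels):
    """Linux only forwards labels strictly below net.mpls.platform_labels."""
    return label < platform_labels


# Hypothetical SRGB of 16000-23999; check what is actually configured.
SRGB_LOWER, SRGB_UPPER = 16000, 23999

small = index_to_label(41, SRGB_LOWER, SRGB_UPPER)        # hypothetical index
large = index_to_label(7168000, SRGB_LOWER, SRGB_UPPER)   # index from the log

print(small, large)
print(fits_platform(small, 100000))  # hypothetical sysctl value
print(fits_platform(68001, 1024))    # a VPN label from this thread vs. a small table
```

A small index maps to a label inside the block, while the index reported in the log cannot map to any label regardless of which base is configured.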
I am facing a similar issue with an IPv6 route. I received the IPv6 route in a BGP update from the peer, and it shows up in the VPN route listing (show bgp ipv6 vpn output), but it doesn't get added to the FIB. I am using FRR version 8.4.4.
frr-84895958c-m6cf6# show bgp ipv6 vpn rd 100:999
BGP table version is 16, local router ID is 192.150.164.230, vrf id 0
Default local pref 100, local AS 65300
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 100:999
2405:70:70:70::70/128 2405:192:150:164::115 0 0 65400 ? *> 2405:192:150:164::115 0 0 65400 ?
frr-84895958c-m6cf6# show ipv6 route vrf VRF1
No output is returned here; the route above is not making it into the FIB.
Any idea?
Describe the bug
To Reproduce
Configure OSPF with segment routing for label distribution, and iBGP in the default VRF. Enable the IPv4 VPN address family and disable unicast. Create VRF 'inet' on frr-boc and the internal routers, placing an interface in the VRF on each router. Configure the route distinguisher and route target to import/export '394211:42' to the inet VRF. See the network diagram in the screenshots.
When testing, a prefix connected to frr-boc (example: 147.0.0.0/30) is advertised by frr-boc to hosts 5171 and 3928 with the 394211:42 route distinguisher, and it is installed on those hosts. A prefix originating on 5171 or 3928 (example: 171.0.0.0/24) and advertised to frr-boc is not installed into the routing table. In the output of show bgp ipv4 vpn rd 394211:42 on frr-boc, the prefixes appear and are marked valid and best. However, they do not appear to be installed into the forwarding table. See the screenshot for an example.
I've included the config of frr-boc below as a txt file. Of note, 147.0.0.0/30 is picked up by 5171 and 3928 and installed into the correct VRF table. Additionally, the two prefixes advertised by 5171 and 3928 are installed into one another's inet VRF table.
Expected behavior
Expected behavior would be that the route installs correctly or an error/reason is indicated for it not being installed.
Screenshots
Versions
Additional context
I'm building this in a lab environment with the original intention of using the current master branch to see if I can replicate an issue we are seeing in a production environment, where IPv6 routes in the VRF are being rejected. However, I'm stuck here, unable to see what I have done in this lab that's preventing these IPv4 routes from installing into the VRF table. This issue is different from the v6 rejected issue, as I'm not seeing the rejected message. I'm unsure if this is something I've done wrong or not!
FRR-BOC lo: 192.168.62.41
5171 lb1: 192.168.62.1
3928 lb1: 192.168.62.2
frr-boc-config.txt