FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

the mp-bgp will not update the routes to the vrf #7073

Closed hothotstreet closed 1 year ago

hothotstreet commented 4 years ago

The MPLS VPN topology is as follows: CE-----PE-----PE------CE

Question: under normal circumstances I can see the VRF routes on the PE, as shown below. The route 10.0.1.0/24 is learned from the other PE.

localhost.localdomain# show ip route vrf vrf_20006_8 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF vrf_20006_8:
K>* 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable), 1d21h06m
O>* 10.0.0.0/24 [110/10010] via 10.251.0.9, tapgw.8, 00:00:22
B>* 10.0.1.0/24 [200/10010] via 172.16.17.10, ens192(vrf default), label 17, 00:00:01
O   10.251.0.0/16 [110/10000] is directly connected, tapgw.8, 00:37:05
C>* 10.251.0.0/16 is directly connected, tapgw.8, 00:37:05
C>* 10.254.0.0/16 is directly connected, tapgw_sp.8, 00:37:05
O   10.254.0.0/16 [110/10000] is directly connected, tapgw_sp.8, 1d21h05m

the ipv4 vpn routes:

localhost.localdomain# show bgp ipv4 vpn 
BGP table version is 26, local router ID is 10.254.128.31, vrf id 0
Default local pref 100, local AS 60030
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 8:8
*> 10.0.0.0/24      10.251.0.9@8<     10010         32768 ?
    UN=10.251.0.9 EC{8:8} label=17 type=bgp, subtype=5
*                   10.251.0.9@8<     10010         32768 ?
    UN=10.251.0.9 EC{8:8} label=17 type=bgp, subtype=5
*> 10.0.1.0/24      172.16.17.10     10010             0 60029 ?
    UN=172.16.17.10 EC{8:8} label=17 type=bgp, subtype=0

Now I manually close the PE interconnection interface, and then open it again. After a while, the LDP neighbors and BGP neighbors are all up again, but the VRF no longer learns the routes of the opposite PE.

**VRF routes: 10.0.1.0/24 is now missing**
localhost.localdomain# show ip route vrf vrf_20006_8 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF vrf_20006_8:
K>* 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable), 1d21h36m
O>* 10.0.0.0/24 [110/10010] via 10.251.0.9, tapgw.8, 00:30:14
O   10.251.0.0/16 [110/10000] is directly connected, tapgw.8, 01:06:57
C>* 10.251.0.0/16 is directly connected, tapgw.8, 01:06:57
C>* 10.254.0.0/16 is directly connected, tapgw_sp.8, 01:06:57
O   10.254.0.0/16 [110/10000] is directly connected, tapgw_sp.8, 1d21h35m

bgp ipv4 vpn routes:

localhost.localdomain# show bgp ipv4 vpn 
BGP table version is 26, local router ID is 10.254.128.31, vrf id 0
Default local pref 100, local AS 60030
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 8:8
*> 10.0.0.0/24      10.251.0.9@8<     10010         32768 ?
    UN=10.251.0.9 EC{8:8} label=17 type=bgp, subtype=5
*                   10.251.0.9@8<     10010         32768 ?
    UN=10.251.0.9 EC{8:8} label=17 type=bgp, subtype=5
*> 10.0.1.0/24      172.16.17.10     10010             0 60029 ?
    UN=172.16.17.10 EC{8:8} label=17 type=bgp, subtype=0

Displayed  2 routes and 3 total paths

If I delete the route 10.0.1.0/24 on the CE and add it again later, the PE's VRF learns it again. Why is that?
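The symptom described above (a prefix present in the VPN table but absent from the VRF routing table) can be checked mechanically. The following is a minimal sketch, assuming the two outputs have been saved as text; the function names and the line-matching heuristics are illustrative assumptions, not part of FRR, and the sample data is an excerpt of the tables posted above.

```python
import re

# Matches an IPv4 prefix such as 10.0.1.0/24.
PREFIX_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}/\d{1,2})")
# Heuristic: a zebra route row starts with a one-letter protocol code
# followed by '>', '*' or a space.
ROUTE_ROW_RE = re.compile(r"^[A-Za-z][ >*]")


def vpn_prefixes(show_bgp_vpn):
    """Prefixes listed in 'show bgp ipv4 vpn' rows that carry the '*' flag."""
    found = set()
    for line in show_bgp_vpn.splitlines():
        if line.startswith("*"):
            match = PREFIX_RE.search(line)
            if match:  # continuation rows have no prefix column
                found.add(match.group(1))
    return found


def vrf_prefixes(show_ip_route_vrf):
    """Prefixes listed in 'show ip route vrf <name>' rows, any protocol."""
    found = set()
    for line in show_ip_route_vrf.splitlines():
        if ROUTE_ROW_RE.match(line):
            match = PREFIX_RE.search(line)
            if match:
                found.add(match.group(1))
    return found


def missing_from_vrf(show_bgp_vpn, show_ip_route_vrf):
    """VPN prefixes that do not appear in the VRF routing table at all."""
    return vpn_prefixes(show_bgp_vpn) - vrf_prefixes(show_ip_route_vrf)


# Excerpt of the outputs captured after the interface was reopened.
VPN_TABLE = """\
Route Distinguisher: 8:8
*> 10.0.0.0/24      10.251.0.9@8<     10010         32768 ?
*                   10.251.0.9@8<     10010         32768 ?
*> 10.0.1.0/24      172.16.17.10     10010             0 60029 ?
"""

VRF_ROUTES = """\
VRF vrf_20006_8:
K>* 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable), 1d21h36m
O>* 10.0.0.0/24 [110/10010] via 10.251.0.9, tapgw.8, 00:30:14
C>* 10.251.0.0/16 is directly connected, tapgw.8, 01:06:57
C>* 10.254.0.0/16 is directly connected, tapgw_sp.8, 01:06:57
"""

print(sorted(missing_from_vrf(VPN_TABLE, VRF_ROUTES)))  # ['10.0.1.0/24']
```

The comparison is deliberately protocol-agnostic on the VRF side, because a locally originated prefix such as 10.0.0.0/24 appears there as an OSPF route rather than a BGP one.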

pguibert6WIND commented 3 years ago

I have already seen that, but I am not sure it is the same as your case. In my case, the outgoing OSPF route had no MPLS label assigned, so the VRF import could not happen.

In your case, the vpnv4 route is selected, which I presume means that the MPLS entries are correct. Would you mind dumping the MPLS table and adding some troubleshooting with this, please?

ubuntu1604es# show mpls table
ubuntu1604es# debu bgp vpn label
ubuntu1604es# debu bgp vpn leak-from-vrf
ubuntu1604es# debu bgp vpn leak-to-vrf

and show me the outputs,

thanks,
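For repeated captures, the commands requested above could be collected in one pass. This is a rough sketch, assuming `vtysh` is on the PATH and the script runs with enough privileges; `vtysh_argv` and `run_diagnostics` are hypothetical helper names, and `debu` is spelled out as `debug`.

```python
import subprocess

# Commands requested in the comment above.
DIAG_COMMANDS = [
    "show mpls table",
    "debug bgp vpn label",
    "debug bgp vpn leak-from-vrf",
    "debug bgp vpn leak-to-vrf",
]


def vtysh_argv(commands):
    """Build one vtysh invocation that passes each command with its own -c."""
    argv = ["vtysh"]
    for command in commands:
        argv += ["-c", command]
    return argv


def run_diagnostics(commands=DIAG_COMMANDS):
    """Run the commands and return whatever vtysh printed on stdout."""
    result = subprocess.run(
        vtysh_argv(commands), capture_output=True, text=True, check=False
    )
    return result.stdout


# Only the argument list is printed here; call run_diagnostics() on a router.
print(vtysh_argv(DIAG_COMMANDS))
```

The debug messages themselves go to the daemon log, so the returned text only contains the `show` output and the confirmation lines.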

hothotstreet commented 3 years ago

Of course, it's my pleasure. Here is the troubleshooting output:

localhost.localdomain(config)# do show ip route vrf vrf_20006_8
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF vrf_20006_8:
K> 0.0.0.0/0 [0/8192] unreachable (ICMP unreachable), 2d01h29m
O> 10.0.1.0/24 [110/10010] via 10.251.0.10, tapgw.8, 00:13:58
O  10.251.0.0/16 [110/10000] is directly connected, tapgw.8, 00:14:15
C> 10.251.0.0/16 is directly connected, tapgw.8, 00:14:15
O  10.254.0.0/16 [110/10000] is directly connected, tapgw_sp.8, 00:14:15
C> 10.254.0.0/16 is directly connected, tapgw_sp.8, 00:14:15

localhost.localdomain(config)# do show bgp ipv4 vpn
BGP table version is 39, local router ID is 10.254.128.30, vrf id 0
Default local pref 100, local AS 60029
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 8:8
 > 10.0.0.0/24      10.253.0.50       10010             0 60030 ?
    UN=10.253.0.50 EC{8:8} label=16 type=bgp, subtype=0
 > 10.0.1.0/24      10.251.0.10@9<    10010         32768 ?
    UN=10.251.0.10 EC{8:8} label=16 type=bgp, subtype=5

Displayed 2 routes and 2 total paths

localhost.localdomain(config)# do show mpls table
 Inbound Label  Type  Nexthop      Outbound Label
 16             BGP   vrf_20006_8  -

localhost.localdomain(config)#

localhost.localdomain# debu bgp vpn label
enabled debug bgp vpn label

localhost.localdomain# debu bgp vpn leak-from-vrf
enabled debug bgp vpn leak-from-vrf

localhost.localdomain# debu bgp vpn leak-to-vrf
enabled debug bgp vpn leak-to-vrf

pguibert6WIND commented 3 years ago

It seems that the MPLS table is incomplete. It looks like your IGP on the WAN side did not react. Can you share 'show ip route'?

hothotstreet commented 3 years ago

localhost.localdomain# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

K> 0.0.0.0/0 [0/0] via 192.168.1.1, ens160, 2d19h41m
C> 10.253.0.48/30 is directly connected, tapbgp_13628, 21:57:04
C> 10.254.128.30/32 is directly connected, lo1, 2d23h21m
B> 10.254.128.31/32 [20/0] via 10.253.0.50, tapbgp_13628, 21:57:04
C> 11.9.8.0/21 is directly connected, tun_mgm, 2d23h21m
K> 169.254.0.0/16 [0/1002] is directly connected, ens160, 2d19h41m
K  169.254.0.0/16 [0/1004] is directly connected, br-lan1, 2d23h21m
K  169.254.0.0/16 [0/1003] is directly connected, ens192, 2d23h21m
C> 172.16.1.0/24 is directly connected, br-lan1, 2d23h21m
C> 172.16.17.0/24 is directly connected, ens192, 2d23h21m
C> 172.31.0.0/16 is directly connected, tapgw, 2d23h21m
C> 172.32.0.0/16 is directly connected, tapgw_sp, 2d23h21m
C  192.168.0.0/23 is directly connected, ens160, 2d19h41m
C> 192.168.0.0/23 is directly connected, ens160, 2d19h41m
K>* 192.168.0.151/32 [0/0] via 192.168.1.1, ens160, 2d19h41m

localhost.localdomain# show ip bgp vrf vrf_20006_8 ipv4
BGP table version is 39, local router ID is 10.254.0.1, vrf id 9
Default local pref 100, local AS 60029
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
   10.0.0.0/24      10.253.0.50@0<    10010             0 60030 ?
*> 10.0.1.0/24      10.251.0.10       10010         32768 ?

localhost.localdomain# show ip bgp ipv4 vpn
BGP table version is 39, local router ID is 10.254.128.30, vrf id 0
Default local pref 100, local AS 60029
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 8:8
 > 10.0.0.0/24      10.253.0.50       10010             0 60030 ?
    UN=10.253.0.50 EC{8:8} label=16 type=bgp, subtype=0
 > 10.0.1.0/24      10.251.0.10@9<    10010         32768 ?
    UN=10.251.0.10 EC{8:8} label=16 type=bgp, subtype=5

Displayed 2 routes and 2 total paths

The best route shown by these two commands is different ("show ip bgp vrf vrf_20006_8 ipv4" and "show ip bgp ipv4 vpn"). Is this the cause of the problem? Why is 10.0.0.0/24 not a valid route?

pguibert6WIND commented 3 years ago
 10.0.0.0/24 10.253.0.50@0< 10010 0 60030 ?

in the 'show ip bgp vrf ipv4' output, this line indicates that the route to the underlay nexthop 10.253.0.50 could not be established.

 C>* 10.253.0.48/30 is directly connected, tapbgp_13628, 21:57:04

shows a route, but without an MPLS label, and this is why the IPv4 route from the VRF could not be selected (missing label on the 10.253 route).

It all looks as if your LDP daemon was not set up. Is LDP still alive?
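The check described above (an imported row without the `*` valid flag) can be scripted against a saved `show ip bgp vrf <name> ipv4` table. A minimal sketch follows; `invalid_rows` is a hypothetical helper, the row-matching regular expression is a heuristic, and the sample rows are taken from the table posted earlier in the thread.

```python
import re

# Everything before the first digit is treated as the status column,
# followed by the prefix and the next-hop field.
ROW_RE = re.compile(r"^(?P<status>[^0-9]*)(?P<prefix>\d+(?:\.\d+){3}/\d+)\s+(?P<nexthop>\S+)")


def invalid_rows(show_ip_bgp_vrf):
    """Return (prefix, nexthop) for table rows that lack the '*' valid flag."""
    rows = []
    for line in show_ip_bgp_vrf.splitlines():
        match = ROW_RE.match(line)
        if match and "*" not in match.group("status"):
            rows.append((match.group("prefix"), match.group("nexthop")))
    return rows


VRF_BGP_TABLE = """\
BGP table version is 39, local router ID is 10.254.0.1, vrf id 9
Default local pref 100, local AS 60029
   Network          Next Hop            Metric LocPrf Weight Path
   10.0.0.0/24      10.253.0.50@0<    10010             0 60030 ?
*> 10.0.1.0/24      10.251.0.10       10010         32768 ?
"""

print(invalid_rows(VRF_BGP_TABLE))  # [('10.0.0.0/24', '10.253.0.50@0<')]
```

The next hop reported for each invalid row is the address to look up in the default-VRF routing and MPLS tables.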

hothotstreet commented 3 years ago

Yes, it is working fine:

localhost.localdomain# show mpls ldp neighbor
AF   ID              State       Remote Address    Uptime
ipv4 10.254.128.30   OPERATIONAL 10.254.128.30     3d07h56m

pguibert6WIND commented 3 years ago

"Now I manually close the PE interconnection interface, and then open it again."

I tend to think the problem is an LDP issue. What is the remote PE product, please?

Further investigation needs to be done on the LDP side. 'show mpls ldp ..' dumps taken before and after the PE interconnection closure would be needed.
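Comparing the before and after dumps is easier with a line diff. The sketch below uses Python's standard `difflib`; `diff_dumps` and its `label` parameter are illustrative names, and which `show mpls ldp` subcommands to capture is left open, as in the comment above.

```python
import difflib


def diff_dumps(before, after, label="dump"):
    """Return a unified diff between two captured command outputs."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{label} (before)",
        tofile=f"{label} (after)",
        lineterm="",
    )
    return "\n".join(lines)


# Placeholder text standing in for two saved captures.
before_text = "line kept\nline that disappears"
after_text = "line kept"
print(diff_dumps(before_text, after_text))
```

Lines prefixed with `-` exist only in the first capture and lines prefixed with `+` only in the second, so a binding or label that did not come back after the interface flap shows up as a `-` line.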

hothotstreet commented 3 years ago

The remote PE is a CentOS host running FRR. I do not think it is an LDP issue, because I have two ways to restore the VRF routes: the first is to re-advertise the route on the CE side, and the second is to re-establish the BGP neighbor session between the ASBRs.

RPBE76 commented 2 years ago

Did you fix this?

hothotstreet commented 2 years ago

I am not sure. It has not happened again. Do you have this problem?

RPBE76 commented 2 years ago

Yes... VPN routes are propagated correctly but are not imported automatically into the VRF. I have to type rt vpn import XX, and only after that are the VPN routes imported into the VRF routing table.

hothotstreet commented 2 years ago

There may be a configuration problem, because I have not encountered this problem since. Please post your configuration and I will help you analyze it.

RPBE76 commented 2 years ago
router bgp 100
 bgp router-id 3.3.3.3
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source 3.3.3.3
 !
 address-family ipv4 vpn
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 next-hop-self force
  neighbor 2.2.2.2 next-hop-self
  neighbor 2.2.2.2 soft-reconfiguration inbound
 exit-address-family
exit
!
router bgp 100 vrf test
 !
 address-family ipv4 unicast
  redistribute connected
  rd vpn export 100:200
  rt vpn both 100:200
  export vpn
  import vpn
 exit-address-family
exit
!

Example: 94.94.94.94 is not imported into the VRF, while 91.91.91.91 is imported correctly.

sh ip bgp ipv4 vpn 94.94.94.94
BGP routing table entry for 100:200:94.94.94.94/32, version 43
not allocated
Paths: (1 available, best #1)
  Not advertised to any peer
  200
    2.2.2.2 from 2.2.2.2 (1.1.1.1)
      Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
      Extended Community: RT:100:200
      Originator: 1.1.1.1, Cluster list: 2.2.2.2
      Remote label: 3
      Last update: Thu Mar 10 12:14:37 2022

sh ip bgp ipv4 vpn 91.91.91.91
BGP routing table entry for 100:200:91.91.91.91/32, version 39
not allocated
Paths: (1 available, best #1)
  Not advertised to any peer
  200
    2.2.2.2 from 2.2.2.2 (1.1.1.1)
      Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
      Extended Community: RT:100:200
      Originator: 1.1.1.1, Cluster list: 2.2.2.2
      Remote label: 3
      Last update: Thu Mar 10 12:02:02 2022

sh ip bgp vrf test 94.94.94.94
BGP routing table entry for 94.94.94.94/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Imported from 100:200:94.94.94.94/32
  200
    2.2.2.2 (inaccessible) from 0.0.0.0 (163.193.183.1) vrf default(0) announce-nh-self
      Origin incomplete, metric 0, localpref 100, invalid, sourced, local
      Extended Community: RT:100:200
      Originator: 1.1.1.1, Cluster list: 2.2.2.2
      Remote label: 3
      Last update: Thu Mar 10 12:14:37 2022

sh ip bgp vrf test 91.91.91.91
BGP routing table entry for 91.91.91.91/32, version 37
Paths: (1 available, best #1, vrf GRX)
  Not advertised to any peer
  Imported from 100:200:91.91.91.91/32
  200
    2.2.2.2 (metric 10) from 0.0.0.0 (163.193.183.1) vrf default(0) announce-nh-self
      Origin incomplete, metric 0, localpref 100, valid, sourced, local, best (First path received)
      Extended Community: RT:100:200
      Originator: 1.1.1.1, Cluster list: 2.2.2.2
      Remote label: 3
      Last update: Thu Mar 10 12:02:14 2022
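The difference between the two VRF entries above is the `(inaccessible)` next hop and the `no best path` summary on 94.94.94.94. A small sketch that extracts those markers from a saved detail block is shown below; `imported_path_status` is a hypothetical helper that relies only on the literal strings visible in the outputs above.

```python
import re


def imported_path_status(detail):
    """Summarize one 'sh ip bgp vrf <name> <prefix>' detail block."""
    source = re.search(r"Imported from (\S+)", detail)
    return {
        "imported_from": source.group(1) if source else None,
        "nexthop_inaccessible": "(inaccessible)" in detail,
        "has_best_path": "no best path" not in detail,
    }


# Excerpt of the 94.94.94.94 entry posted above.
DETAIL_94 = """\
BGP routing table entry for 94.94.94.94/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Imported from 100:200:94.94.94.94/32
  200
    2.2.2.2 (inaccessible) from 0.0.0.0 (163.193.183.1) vrf default(0) announce-nh-self
      Origin incomplete, metric 0, localpref 100, invalid, sourced, local
"""

print(imported_path_status(DETAIL_94))
```

For the excerpt above, the result reports that the path was imported from 100:200:94.94.94.94/32, that its next hop is inaccessible, and that no best path was selected.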

RPBE76 commented 2 years ago

FYI: the same config with FRR version 7.5.1 has no issue and works as expected.

hothotstreet commented 2 years ago

Which version are you running?

RPBE76 commented 2 years ago

8.1

rera1712 commented 2 years ago

@RPBE76 Is this solved on your end? I am seeing a similar issue with 8.2.2.

hothotstreet commented 2 years ago

I haven't had this problem in other environments since.

ton31337 commented 1 year ago

@rera1712 please create another issue ("similar issue"), closing this.