FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

BFD session is not created when I put the configuration to frr.conf and restart frr. #17408

Closed tufeigunchu closed 2 days ago

tufeigunchu commented 3 days ago

Description

I used vtysh to add a new multihop BGP neighbor. The neighbor is a dummy IP address that cannot be reached. I also enabled BFD on it, and I can see the BFD session with the command "show bfd peers". But after I save the config to frr.conf and restart FRR, no BFD session is listed.

Version

3500X# show version
FRRouting 10.3-dev (3500X) on Linux(6.10.6).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/opt/frr' '--enable-dev-build' '--enable-user=root'

How to reproduce

  1. Build FRR from the master branch and install it; enable the bgpd and bfdd daemons
  2. start frr with command "sudo /opt/frr/sbin/watchfrr.sh all_start"

Use vtysh to define a dummy multihop ebgp neighbor and enable bfd:

router bgp 1234
 bgp router-id 1.1.1.2
 no bgp ebgp-requires-policy
 no bgp network import-check
 neighbor 192.168.1.222 remote-as 123
 neighbor 192.168.1.222 bfd
 neighbor 192.168.1.222 ebgp-multihop 32
 neighbor 192.168.1.222 update-source 192.168.1.5
exit

Check the BFD session:

3500X(config-router)# do show bfd peers
BFD Peers:
        peer 192.168.1.222 multihop local-address 192.168.1.5 vrf default
                ID: 3092640229
                Remote ID: 0
                Active mode
                Minimum TTL: 224
                Status: down
                Downtime: 37 minute(s), 8 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: dynamic
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 1000ms
                        Transmission interval: 1000ms
                        Echo receive interval: disabled
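
A note on the "Minimum TTL: 224" value above: it is consistent with the configured "ebgp-multihop 32" if the minimum accepted TTL is derived as 256 minus the allowed hop count. That relation is inferred from this output only and is an assumption, not a quote from the FRR source. The helper name below is hypothetical.

```python
BFD_TTL_MAX = 255  # highest value an IP TTL / hop limit can take


def expected_min_ttl(max_hops):
    """Return the minimum TTL a receiver would accept for a peer that is
    at most `max_hops` away, under the assumed relation 256 - max_hops."""
    if not 1 <= max_hops <= BFD_TTL_MAX:
        raise ValueError("max_hops must be between 1 and 255")
    return BFD_TTL_MAX + 1 - max_hops


# ebgp-multihop 32, as configured in this report
print(expected_min_ttl(32))
```

With a hop count of 32 this gives 224, matching the value shown for the dynamic peer.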

Save the running-config to file, restart FRR, and check the BFD session again:

3500X(config-router)# do copy running-config startup-config 
Note: this version of vtysh never writes vtysh.conf

Warning: attempting direct configuration write without watchfrr.
File permissions and ownership may be incorrect, or write may fail.

Building Configuration...
Integrated configuration saved to /opt/frr/etc/frr/frr.conf
[OK]
3500X(config-router)# 
jim@3500X ~ $ sudo /opt/frr/sbin/watchfrr.sh all_stop
jim@3500X ~ $ sudo /opt/frr/sbin/watchfrr.sh all_start
2024/11/11 16:46:47 ZEBRA: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 MGMTD: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 BGP: [ZG9QC-QRCJZ] failed to mkdir "/var/tmp/frr/bgpd.19318": File exists
2024/11/11 16:46:47 BGP: [M1DC0-ZDNYJ] crashlog and per-thread log buffering unavailable!
2024/11/11 16:46:47 BGP: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 OSPF: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 OSPF6: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 LDP: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 STATIC: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
2024/11/11 16:46:47 BFD: [QYS7B-8K2EV][EC 100663303] failed to mkdir "/opt/frr/var/lib/frr": No such file or directory
[19344|mgmtd] sending configuration
[19345|zebra] sending configuration
[19348|ospfd] sending configuration
[19349|ospf6d] sending configuration
[19350|ldpd] sending configuration
[19351|bgpd] sending configuration
[19348|ospfd] done
[19349|ospf6d] done
[19350|ldpd] done
[19344|mgmtd] done
[19361|staticd] sending configuration
[19362|bfdd] sending configuration
Waiting for children to finish applying config...
[19351|bgpd] done
[19361|staticd] done
[19362|bfdd] done
[19345|zebra] done
jim@3500X ~ $ sudo /opt/frr/bin/vtysh 

Hello, this is FRRouting (version 10.3-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

3500X# show run
Building configuration...

Current configuration:
!
frr version 10.3-dev
frr defaults traditional
hostname 3500X
log file /tmp/bgp.log
service integrated-vtysh-config
!
debug bgp bfd
!
vrf vrf1
exit-vrf
!
router bgp 1234
 bgp router-id 1.1.1.2
 no bgp ebgp-requires-policy
 no bgp network import-check
 neighbor 192.168.1.222 remote-as 123
 neighbor 192.168.1.222 bfd
 neighbor 192.168.1.222 ebgp-multihop 32
 neighbor 192.168.1.222 update-source 192.168.1.5
exit
!
end
3500X# show bfd peers
BFD Peers:
3500X#

Expected behavior

The BFD session should still be listed by "show bfd peers" after the restart.

Actual behavior

After the restart, "show bfd peers" lists no BFD session.


ton31337 commented 3 days ago

Just out of curiosity... What happens if you move neighbor 192.168.1.222 bfd below update-source?

tufeigunchu commented 3 days ago

No luck. Also, "show run" reorders the lines back to the original order.

ton31337 commented 3 days ago

Could you also give us the logs with the following debugging enabled?

debug bfd peer
debug bfd zebra
debug bfd network

tufeigunchu commented 3 days ago

2024/11/11 17:15:44 MGMTD: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 MGMTD: [G6NKK-8C6DV] end_config: VTY:0x5555559d2450, pending SET-CFG: 1
2024/11/11 17:15:44 LDP: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 LDP: [G6NKK-8C6DV] end_config: VTY:0x555555814f00, pending SET-CFG: 0
2024/11/11 17:15:44 OSPF: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 OSPF: [G6NKK-8C6DV] end_config: VTY:0x5555559e1dd0, pending SET-CFG: 0
2024/11/11 17:15:44 MGMTD: libyang: Failed to open file "/opt/frr/var/lib/frr/commit-20241111171544800251247.json" (No such file or directory).
2024/11/11 17:15:44 MGMTD: [HAWY9-CP261] Failed to open commit history "/opt/frr/var/lib/frr/commit-index.dat" for writing: No such file or directory
2024/11/11 17:15:44 OSPF6: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 OSPF6: [G6NKK-8C6DV] end_config: VTY:0x5555558e0a20, pending SET-CFG: 0
2024/11/11 17:15:44 BGP: [Z0JSE-7MQK8] _bfd_sess_valid: multi hop but no local address provided
2024/11/11 17:15:44 BGP: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 BGP: [G6NKK-8C6DV] end_config: VTY:0x555556107e50, pending SET-CFG: 0
2024/11/11 17:15:44 STATIC: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 STATIC: [G6NKK-8C6DV] end_config: VTY:0x5555556a9680, pending SET-CFG: 0
2024/11/11 17:15:44 BFD: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 BFD: [G6NKK-8C6DV] end_config: VTY:0x555555704140, pending SET-CFG: 0
2024/11/11 17:15:44 BGP: [P6TNR-ZB6G6] zclient_bfd_session_replay: sending all sessions registered
2024/11/11 17:15:44 ZEBRA: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2024/11/11 17:15:44 ZEBRA: [G6NKK-8C6DV] end_config: VTY:0x555555c03310, pending SET-CFG: 0
2024/11/11 17:15:44 BFD: [WP2Q1-73DVN] VRF Created: vrf1(17)
2024/11/11 17:15:44 BFD: [QV7KP-RSBE6] VRF enable add vrf1 id 17
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface docker0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 172.17.0.1/16 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::42:bdff:fe30:10e5/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface dummy0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface eth0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface eth1 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface lo (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface tunl0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface vethedbb382 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::d49d:fbff:fe8d:15de/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface virbr0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.122.1/24 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 10.3.2.1/24 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface wlan0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.1.5/24 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 2408:825c:6421:77f7:d10e:f508:3ad5:aab2/64 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 2408:825c:6421:9c6c:6443:cb38:2d92:52a1/64 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::d227:b943:3239:28b4/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface ztc3qw4fhf (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.195.228/24 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface dummy1 (VRF vrf1(17))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.196.22/24 (VRF 17)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::b0fc:76ff:fecf:2021/64 (VRF 17)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface vrf1 (VRF vrf1(17))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface docker0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 172.17.0.1/16 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::42:bdff:fe30:10e5/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface dummy0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface eth0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface eth1 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface lo (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface tunl0 (VRF default(0))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface vethedbb382 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::d49d:fbff:fe8d:15de/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface virbr0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.122.1/24 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 10.3.2.1/24 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface wlan0 (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.1.5/24 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 2408:825c:6421:77f7:d10e:f508:3ad5:aab2/64 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 2408:825c:6421:9c6c:6443:cb38:2d92:52a1/64 (VRF 0)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::d227:b943:3239:28b4/64 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface ztc3qw4fhf (VRF default(0))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.195.228/24 (VRF 0)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface dummy1 (VRF vrf1(17))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.196.22/24 (VRF 17)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::b0fc:76ff:fecf:2021/64 (VRF 17)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface vrf1 (VRF vrf1(17))
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface dummy1 (VRF vrf1(17))
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address 192.168.196.22/24 (VRF 17)
2024/11/11 17:15:44 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::b0fc:76ff:fecf:2021/64 (VRF 17)
2024/11/11 17:15:44 BFD: [GCWEX-N0BBE] zclient: add interface vrf1 (VRF vrf1(17))
2024/11/11 17:15:45 ZEBRA: [V98V0-MTWPF] client 68 says hello and bids fair to announce only bgp routes vrf=0
2024/11/11 17:15:45 BGP: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected

ton31337 commented 3 days ago

The log message "_bfd_sess_valid: multi hop but no local address provided" indicates that this really is related to the ordering. I'm not very familiar with how the BFD code should behave here, but maybe @rzalamena can spot what is wrong.
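
For readers following along, the relevant line is easy to miss among the interface and address messages. Below is a minimal sketch for pulling BFD-related lines of one daemon out of such a log. It assumes only the line layout visible in the logs pasted in this thread (timestamp, daemon name, optional bracketed tags, message); the function names are hypothetical and not part of FRR.

```python
import re

# Matches lines shaped like the ones pasted in this thread, e.g.
# 2024/11/11 17:15:44 BGP: [Z0JSE-7MQK8] _bfd_sess_valid: multi hop ...
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<daemon>[A-Z0-9]+): "
    r"(?P<tags>(?:\[[^\]]*\])*)\s*"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into timestamp, daemon, tags and message.
    Returns None when the line does not follow the assumed layout."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def bfd_related(lines, daemon="BGP"):
    """Keep records logged by `daemon` whose message mentions 'bfd'."""
    records = []
    for line in lines:
        rec = parse_line(line)
        if rec and rec["daemon"] == daemon and "bfd" in rec["msg"].lower():
            records.append(rec)
    return records


sample = [
    "2024/11/11 17:15:44 BGP: [Z0JSE-7MQK8] _bfd_sess_valid: multi hop but no local address provided",
    "2024/11/11 17:15:44 BGP: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00",
    "2024/11/11 17:15:44 BGP: [P6TNR-ZB6G6] zclient_bfd_session_replay: sending all sessions registered",
]
for rec in bfd_related(sample):
    print(rec["ts"], rec["msg"])
```

Applied to the three sample lines, which are copied from the log above, the filter keeps the "_bfd_sess_valid" and "zclient_bfd_session_replay" messages and drops the configuration timing line.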

rzalamena commented 3 days ago

If I remember correctly, the BFD session is only installed after the BGP peer connection moves to the Established state. Looking at the end of the logs, it seems it never connected:

2024/11/11 17:15:45 BGP: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected

ton31337 commented 3 days ago

diff --git a/bgpd/bgp_bfd.c b/bgpd/bgp_bfd.c
index 14ff5f2e11..1a6431ce92 100644
--- a/bgpd/bgp_bfd.c
+++ b/bgpd/bgp_bfd.c
@@ -302,6 +302,8 @@ void bgp_peer_configure_bfd(struct peer *p, bool manual)
        if (p->nexthop.ifp)
                bfd_sess_set_interface(p->bfd_config->session,
                                       p->nexthop.ifp->name);
+
+       bgp_peer_bfd_update_source(p);
 }

This does the trick when BFD is configured after the update-source/ebgp-multihop lines, but the ordering still matters.

This is fine:

 neighbor 192.168.1.222 ebgp-multihop 32
 neighbor 192.168.1.222 update-source 192.168.1.5
 neighbor 192.168.1.222 bfd

This is bad:

 neighbor 192.168.1.222 bfd
 neighbor 192.168.1.222 ebgp-multihop 32
 neighbor 192.168.1.222 update-source 192.168.1.5
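
The two orderings above can be illustrated with a toy model. This is only a sketch: the class, its fields, and the rule that a later update-source is not pushed into an already created session are assumptions chosen to reproduce the symptom reported here. It is not FRR's actual logic.

```python
class ToyPeer:
    """Hypothetical stand-in for a BGP neighbor; not an FRR data structure."""

    def __init__(self):
        self.multihop = False
        self.source = None
        self.session = None  # becomes a dict once "bfd" has been applied

    def apply(self, command):
        """Apply one neighbor sub-command, in the order it is read."""
        if command == "bfd":
            # The session snapshots whatever is known at this moment.
            self.session = {"multihop": self.multihop, "local": self.source}
        elif command.startswith("ebgp-multihop"):
            self.multihop = True
            if self.session is not None:
                self.session["multihop"] = True
        elif command.startswith("update-source "):
            self.source = command.split()[1]
            # Assumption: an existing session is NOT updated here.


def session_installable(peer):
    """A multihop session without a local address is rejected, mirroring the
    'multi hop but no local address provided' message in the log."""
    s = peer.session
    if s is None:
        return False
    return not (s["multihop"] and s["local"] is None)


def run(commands):
    peer = ToyPeer()
    for command in commands:
        peer.apply(command)
    return session_installable(peer)


fine = ["ebgp-multihop 32", "update-source 192.168.1.5", "bfd"]
bad = ["bfd", "ebgp-multihop 32", "update-source 192.168.1.5"]
print(run(fine), run(bad))
```

In this model the first ordering yields an installable session and the second does not. The second ordering is also the one "show run" writes to frr.conf, which is the order read back at startup.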

ton31337 commented 3 days ago

Looking at https://github.com/FRRouting/frr/issues/17396, I see this is technically the same, and now I can reproduce it.

ton31337 commented 3 days ago

It seems I found the root cause; I will push a fix soon.