FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

BFD not working with OSPF? #2889

Closed chsummers closed 5 years ago

chsummers commented 6 years ago

Hi, I'm testing BFD for rapid OSPF convergence, but it seems that the OSPF daemon is ignoring BFD notifications.

This is what I get from bfdd:

BFD Peers:
        peer 1.1.1.1 interface xe1
                Control packet input: 682482 packets
                Control packet output: 647027 packets
                Echo packet input: 0 packets
                Echo packet output: 0 packets
                Session up events: 2
                Session down events: 1
                Zebra notifications: 5

bfdd seems to be working correctly, tracking peers up and down, but OSPF neighbors and routes are not removed when a BFD-enabled peer goes down. They are removed only after the OSPF dead interval expires.

Maybe there is something I'm doing wrong?

BR

pguibert6WIND commented 6 years ago

Hi @chsummers, is it possible to share the configuration you use, please?

chsummers commented 6 years ago

Sure thing. All IP addresses have been redacted for privacy reasons.

interface lo
 ip address $LOOPBACK_IP
!
interface xe1
 ip address $PTP_IP/31
 ip ospf bfd
 ip ospf network point-to-point
!
router-id $LOOPBACK_IP
!
router ospf
 ospf router-id $LOOPBACK_IP
 log-adjacency-changes
 passive-interface default
 no passive-interface xe1
 network $LOOPBACK_IP/32 area 0
 network $PTP_NET/31 area 0
 capability opaque
!
mpls ldp
 router-id $LOOPBACK_IP
 !
 address-family ipv4
  discovery transport-address $LOOPBACK_IP
  label local allocate for LDP
  label local advertise explicit-null for LDP
  !
  interface xe1
  !
 exit-address-family
 !
!
ip prefix-list LDP seq 10 xxxxxxxxx
ip prefix-list LDP seq 60 deny 0.0.0.0/0
!
bfd
 peer $PTP2_IP interface xe1
  no shutdown
 !
!

Please let me know if you need further information.

Edit: added loopback IP

rzalamena commented 6 years ago

Hello @chsummers ,

Thank you for your report, but I think we are going to need a little more information. I see that you have a properly configured BFD peer; however, I could not tell whether the interface has the ip ospf bfd configuration.

Secondly: when I tested bfdd against OSPF, I watched the output of show ip ospf route to see whether the topology was converging quickly. With BFD disabled, the routes should hang around until the dead timer expires; with BFD enabled, they should disappear as soon as OSPF is notified that the peer went down.
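To put rough numbers on "fast converging": in asynchronous mode, RFC 5880 (section 6.8.4) defines the detection time as the remote detect multiplier times the larger of the local required min RX interval and the remote desired min TX interval. The sketch below is only that arithmetic. The function name and the example values (multiplier 3, 300 ms intervals, 40 s dead interval) are assumptions for illustration, not measurements from this setup.

```python
def bfd_detection_time_ms(remote_detect_mult, local_required_min_rx_ms,
                          remote_desired_min_tx_ms):
    """Asynchronous-mode BFD detection time, per RFC 5880 section 6.8.4."""
    agreed_interval = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval


# Assumed example values: multiplier 3 and 300 ms intervals on both sides.
detection_ms = bfd_detection_time_ms(3, 300, 300)
ospf_dead_interval_ms = 40 * 1000  # assumed 40 s OSPF dead interval

print(f"BFD detection time: {detection_ms} ms")
print(f"OSPF dead interval: {ospf_dead_interval_ms} ms")
```

If OSPF reacts to the BFD notification, routes should be withdrawn on the order of the first number; if it ignores the notification, they linger for up to the second.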

EasyNetDev commented 6 years ago

Hi,

It seems that I'm experiencing the same issue: even though BFD shows the peer as down, the routes stay active until the OSPF dead interval expires.

R5# show bfd peers 
BFD Peers:
    peer 10.191.0.101 interface tun0
        ID: 3
        Remote ID: 2
        Status: up
        Uptime: 9 second(s)
        Diagnostics: ok
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: disabled
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms

R5# show ip route ospf 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR,
       > - selected route, * - FIB route

O>* 0.0.0.0/0 [110/501] via 10.191.0.101, tun0, 00:00:13
O>* 10.0.0.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.0.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.0.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.1.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.1.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.2.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.2.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.3.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.3.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.4.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.4.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.5.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.5.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.6.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.6.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.7.0/24 [110/20] via 10.191.0.101, tun0, 00:00:12
O>* 10.8.7.2/32 [110/20] via 10.191.0.101, tun0, 00:00:12

After adding an iptables rule that drops protocol 4 (IPIP):

R5# show bfd peers 
BFD Peers:
    peer 10.191.0.101 interface tun0
        ID: 3
        Remote ID: 0
        Status: down
        Downtime: 4 second(s)
        Diagnostics: control detection time expired
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: disabled
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms

R5# show ip route ospf 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR,
       > - selected route, * - FIB route

O>* 0.0.0.0/0 [110/501] via 10.191.0.101, tun0, 00:01:19
O>* 10.0.0.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.0.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.0.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.1.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.1.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.2.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.2.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.3.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.3.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.4.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.4.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.5.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.5.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.6.0/24 [110/20] via 10.191.0.101, tun0, 00:01:18
O>* 10.8.6.2/32 [110/20] via 10.191.0.101, tun0, 00:01:18

R5# show ip ospf neighbor 

Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
10.190.0.1        1 Full/DROther      17.622s 10.191.0.101    tun0:10.191.0.102        0     0     0
10.190.0.2        1 Init/DROther       9.962s 10.191.0.106    tun1:10.191.0.105        0     0     0

I'm using FRR 6.0.

Kind regards, Adrian
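The comparison done by eye above (BFD peer down, OSPF neighbor still Full) can be automated. The following is a minimal sketch that assumes the text layouts of show bfd peers and show ip ospf neighbor shown in this comment; the function names are hypothetical and the parsing may need adjusting for other FRR versions.

```python
import re


def parse_bfd_peers(text):
    """Map peer address -> BFD status from 'show bfd peers' style text."""
    peers = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*peer (\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*Status: (\S+)", line)
        if m and current is not None:
            peers[current] = m.group(1)
    return peers


def parse_ospf_neighbors(text):
    """Map neighbor address -> OSPF state from 'show ip ospf neighbor' text."""
    neighbors = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a dotted-quad neighbor ID; the header does not.
        if len(fields) >= 5 and re.fullmatch(r"\d+\.\d+\.\d+\.\d+", fields[0]):
            neighbors[fields[4]] = fields[2]
    return neighbors


def stale_neighbors(bfd_text, ospf_text):
    """Addresses that OSPF still reports as Full while BFD reports down."""
    bfd = parse_bfd_peers(bfd_text)
    ospf = parse_ospf_neighbors(ospf_text)
    return sorted(addr for addr, state in ospf.items()
                  if state.startswith("Full") and bfd.get(addr) == "down")


# Sample text trimmed from the outputs in this comment.
BFD_SAMPLE = """\
BFD Peers:
    peer 10.191.0.101 interface tun0
        ID: 3
        Remote ID: 0
        Status: down
        Downtime: 4 second(s)
"""

OSPF_SAMPLE = """\
Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
10.190.0.1        1 Full/DROther      17.622s 10.191.0.101    tun0:10.191.0.102        0     0     0
10.190.0.2        1 Init/DROther       9.962s 10.191.0.106    tun1:10.191.0.105        0     0     0
"""

print(stale_neighbors(BFD_SAMPLE, OSPF_SAMPLE))
```

Any address it prints is a neighbor that OSPF should already have dropped if the BFD notification had been acted on.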

rzalamena commented 6 years ago

Hi @chsummers and @AdrianBan. I tried to reproduce your problem (routes not being withdrawn on BFD down) and could not see it with master or stable/6.0.

This is my configuration.

Here is a tip to help identify OSPF not receiving BFD down notification:

Please, can you tell me if that message appears for you? Also, can you ( @AdrianBan ) provide more details on your configuration?

EasyNetDev commented 6 years ago

Hi @rzalamena,

Unfortunately the FRR logs don't show me anything. BFD is down, and after 10 seconds OSPF goes down as well, but nothing is logged:

R5(config)# do sh run
Building configuration...

Current configuration:
!
frr version 6.0
frr defaults traditional
hostname R5
log monitor warnings
log syslog
log timestamp precision 3
service integrated-vtysh-config
username cumulus nopassword
!
debug ospf nsm
!
...
interface tun0
 bandwidth 200
 description GRE;SR.BB.BUH-ABT.HQ.1;TUNNEL
 ip ospf bfd
 ip ospf dead-interval 10
 ip ospf hello-interval 1
 ip ospf retransmit-interval 3
 multicast
...
router ospf
 ospf router-id 10.191.0.1
 log-adjacency-changes detail
 redistribute connected route-map rm-OSFP-FILTER-out
 network 5.XXX.XXX.16/32 area 0.0.0.10
 network 5.XXX.XXX.18/32 area 0.0.0.10
 network 10.191.0.1/32 area 0.0.0.10
 network 10.191.0.100/30 area 0.0.0.10
 network 10.191.0.104/30 area 0.0.0.10
 network 10.191.0.116/30 area 0.0.0.10
 network 172.17.64.0/23 area 0.0.0.10
 area 10 nssa 
 area 10 nssa no-summary
 neighbor 10.191.0.101
 neighbor 10.191.0.106
...
bfd
 peer 10.191.0.101 interface tun0
  receive-interval 50
  transmit-interval 50
  no shutdown
 !
!

EasyNetDev commented 6 years ago

Hi,

It seems that once I configured the BGP neighbor with BFD, OSPF also started to log these lines:

Dec  3 00:00:02 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:00:02 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:00:02 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (PacketReceived)
Dec  3 00:00:02 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (2-WayReceived)
Dec  3 00:00:03 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:00:03 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:00:03 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (PacketReceived)
Dec  3 00:00:03 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (2-WayReceived)
Dec  3 00:00:04 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:00:04 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:00:04 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (PacketReceived)
Dec  3 00:00:04 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (2-WayReceived)
Dec  3 00:00:05 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)

This is very, very strange! Now I will remove BFD from the BGP configuration to see whether OSPF still logs the NSM events.

EasyNetDev commented 6 years ago

Hi,

I've tried to kill the connection and the result is:

Dec  3 00:08:15 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:16 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:16 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:17 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:17 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:18 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:18 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Timer (Inactivity timer expire)
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Full (InactivityTimer)
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change Full -> Deleted (InactivityTimer)
Dec  3 00:08:19 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: Full -> Deleted (InactivityTimer)
Dec  3 00:08:19 R5 ospfd[1866]: nsm_change_state:(10.190.0.1, Full -> Deleted): scheduling new router-LSA origination
Dec  3 00:08:20 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:20 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:20 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)
Dec  3 00:08:20 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:21 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (PacketReceived)
Dec  3 00:08:21 R5 ospfd[1866]: NSM[tun1:10.191.0.105:10.190.0.2]: Full (2-WayReceived)

The neighbor still goes down only after the OSPF dead timer expires.

Then, when the link comes back up:

Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Down (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change Down -> Init (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: Down -> Init (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Init (2-WayReceived)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change Init -> ExStart (2-WayReceived)
Dec  3 00:11:33 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: Init -> ExStart (2-WayReceived)
Dec  3 00:11:33 R5 ospfd[1866]: Packet[DD]: Neighbor 10.190.0.1: Initial DBD from Slave, ignoring.
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: ExStart (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: Packet[DD]: Neighbor 10.190.0.1 Negotiation done (Master).
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: ExStart (NegotiationDone)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change ExStart -> Exchange (NegotiationDone)
Dec  3 00:11:33 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: ExStart -> Exchange (NegotiationDone)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Exchange (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Exchange (PacketReceived)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: Exchange (ExchangeDone)
Dec  3 00:11:33 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change Exchange -> Full (ExchangeDone)
Dec  3 00:11:33 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: Exchange -> Full (ExchangeDone)
Dec  3 00:11:33 R5 ospfd[1866]: nsm_change_state:(10.190.0.1, Exchange -> Full): scheduling new router-LSA origination

With BGP the reaction is instant when I drop the packets. With OSPF it is not working at all :(.

Kind regards, Adrian
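The telling detail in the log above is the reason in parentheses on the AdjChg line: InactivityTimer means the dead timer fired, not a BFD notification. A small sketch for pulling those reasons out of an ospfd log follows. The regular expression is an assumption based only on the AdjChg lines shown here, and the function name is hypothetical.

```python
import re

# Pattern inferred from the AdjChg lines above; other FRR versions may differ.
ADJCHG = re.compile(r"AdjChg: Nbr (\S+) on (\S+): (\S+) -> (\S+) \((\S+)\)")


def adjacency_changes(log_text):
    """Return (neighbor, interface, old_state, new_state, reason) tuples."""
    changes = []
    for line in log_text.splitlines():
        m = ADJCHG.search(line)
        if m:
            changes.append(m.groups())
    return changes


# Two lines copied from the log above.
SAMPLE = """\
Dec  3 00:08:19 R5 ospfd[1866]: NSM[tun0:10.191.0.102:10.190.0.1]: State change Full -> Deleted (InactivityTimer)
Dec  3 00:08:19 R5 ospfd[1866]: AdjChg: Nbr 10.190.0.1 on tun0:10.191.0.102: Full -> Deleted (InactivityTimer)
"""

for nbr, ifname, old, new, reason in adjacency_changes(SAMPLE):
    print(f"{nbr} on {ifname}: {old} -> {new} because of {reason}")
```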

EasyNetDev commented 6 years ago

@rzalamena could the issue be that you are using a BSD kernel while I am on a Linux 4.18 kernel? Or maybe there is a problem in the BFD-OSPF communication: either BFD doesn't notify OSPF correctly, or OSPF doesn't properly register the handler that receives notifications from BFD?

rzalamena commented 6 years ago

@AdrianBan

> Unfortunately the FRR logs don't show me anything. BFD is down, and after 10 seconds OSPF goes down as well, but nothing is logged:

Would you mind getting me debug logs for ospfd and bfdd? Here are the commands:

vtysh -d ospfd
configure terminal
debug ospf nsm
debug ospf zebra
log file /tmp/ospfd.log debug

vtysh -d bfdd
configure terminal
log file /tmp/bfdd.log debug

The new OSPF debug should give you a new BFD debug message like this one:

2018/12/03 12:27:08 OSPF: Zebra: interface em1 bfd destination 10.0.1.11/32 Down 
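Checking a large log for that message can be scripted. This is a minimal sketch: the regular expression is an assumption derived from the single sample line above, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one sample line above.
BFD_NOTIFY = re.compile(
    r"OSPF: Zebra: interface (\S+) bfd destination (\S+) (\S+)")


def bfd_notifications(lines):
    """Yield (interface, destination, state) for each BFD notification line."""
    for line in lines:
        m = BFD_NOTIFY.search(line)
        if m:
            yield m.groups()


sample = [
    "2018/12/03 12:27:08 OSPF: Zebra: interface em1 bfd destination 10.0.1.11/32 Down ",
]
print(list(bfd_notifications(sample)))
```

To run it against the log file configured above, pass an open file object instead of the sample list, for example `bfd_notifications(open("/tmp/ospfd.log"))`. An empty result while bfdd reports the peer down would suggest that the notification never reached ospfd.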

> @rzalamena could the issue be that you are using a BSD kernel while I am on a Linux 4.18 kernel?

I doubt it. The only difference between BSD and Linux is the underlying packet capture mechanism. The master branch should have it standardized (no more raw sockets).

> Or maybe there is a problem in the BFD-OSPF communication: either BFD doesn't notify OSPF correctly, or OSPF doesn't properly register the handler that receives notifications from BFD?

This is what I'm investigating. bfdd is a port of the Cumulus PTM daemon, and I might have gotten things wrong when porting it natively to FRR.

I've built a setup just like yours and could not reproduce the problem :( . The only difference is that I simulate link down using the interface state (e.g. ifconfig em1 down).

Thanks for your help!

EasyNetDev commented 6 years ago

Hi rzalamena,

What I simulated was dropping protocol 4 (IPIP) on one side, like this:

iptables -A OUTPUT -d 89.XXX.XXX.208 -j DROP

This simulates unidirectional communication. I also tried:

ip link set tun0 down

on one router, with the debugging enabled on the far end. Same result: the OSPF dead timer expired and nothing about BFD was logged, even though the peer was down.

I will try the debug settings you described earlier.

Kind regards, Adrian

EasyNetDev commented 6 years ago

Hi @rzalamena,

I'm attaching 4 files, covering two scenarios:

  1. Logs taken while simulating a unidirectional link (UDL) with iptables -A OUTPUT -d 89.XXX.XXX.208 -j DROP on the far end, so that the far end can still receive packets from the R5 router but R5 receives no traffic at all from R1.
  2. Logs taken with link down: ip link set tun0 down on the far end.

link-down - BFD logs.txt
link-down - OSPFD logs.txt
UDL using iptables - BFD logs.txt
UDL using iptables - OSPFD logs.txt

Using link down seems to work: zebra switches to the alternative path quickly. In the UDL case it seems that nothing is announced.

rzalamena commented 6 years ago

@AdrianBan I think I found the problem; however, if you confirm it, I don't know yet how to fix it.

It seems that BFD only works with NBMA (Non-Broadcast Multi-Access) OSPF networks. I tried the same OSPF setup with the network type set to anything other than NBMA, and BFD stops working.

Here is something for you to try: configure your interface to use NBMA and test BFD again.

interface tun0
 ip ospf network non-broadcast
!

I went through the code path and found the following hint: when configuring an OSPF neighbor on an interface without NBMA, the neighbor doesn't get added to a data structure here ( https://github.com/FRRouting/frr/blob/master/ospfd/ospfd.c#L1940 ).

When the BFD notification arrives, you do get the log message (OSPF: Zebra: interface XXX bfd destination A.B.C.D/M YYYY); however, the connection state doesn't change. That is because the OSPF interface data structure doesn't have a pointer to this neighbor reference here ( https://github.com/FRRouting/frr/blob/master/ospfd/ospf_bfd.c#L232 ).
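Purely to illustrate the hypothesis described in this comment, here is a toy model in Python. It is not FRR code: the class, its attributes, and its methods are all hypothetical, and it only mirrors the described behaviour (the neighbor is recorded only for NBMA, so the BFD handler finds nothing to update on other network types).

```python
class ToyOspfInterface:
    """Toy stand-in for an OSPF interface; not FRR's real data structure."""

    def __init__(self, name, network_type):
        self.name = name
        self.network_type = network_type
        self.tracked = {}  # neighbor address -> adjacency state

    def configure_neighbor(self, address):
        # Hypothesis from the comment: the neighbor is only recorded in the
        # lookup table when the network type is NBMA (non-broadcast).
        if self.network_type == "non-broadcast":
            self.tracked[address] = "Full"

    def bfd_down(self, address):
        # The notification always arrives (and would be logged), but without
        # a table entry there is no neighbor state to change.
        if address not in self.tracked:
            return False
        self.tracked[address] = "Down"
        return True


nbma = ToyOspfInterface("tun0", "non-broadcast")
p2p = ToyOspfInterface("tun0", "point-to-point")
for oi in (nbma, p2p):
    oi.configure_neighbor("10.191.0.101")
    print(oi.network_type, "reacts to BFD down:", oi.bfd_down("10.191.0.101"))
```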

EasyNetDev commented 6 years ago

@rzalamena so it is indeed a bug. :) Lucky that I found it.

EasyNetDev commented 5 years ago

@rzalamena here is the OSPF output after switching the interfaces to non-broadcast:

R2:

 Neighbor 10.191.0.1, interface address 10.191.0.105
    In the area 0.0.0.10 [NSSA] via interface tun0
    Neighbor priority is 1, State is Full, 6 state changes
    Most recent state change statistics:
      Progressive change 1m27s ago
    DR is 10.191.0.105, BDR is 10.191.0.106
    Options 8 *|-|-|-|N/P|-|-|-
    Dead timer due in 36.732s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Up, Last update: 0:00:00:02

R5

 Neighbor 10.190.0.1, interface address 10.191.0.101
    In the area 0.0.0.10 [NSSA] via interface tun0
    Neighbor priority is 1, State is Full, 5 state changes
    Most recent state change statistics:
      Progressive change 5h56m29s ago
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 8 *|-|-|-|N/P|-|-|-
    Dead timer due in 36.744s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission on
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never

 Neighbor 10.190.0.2, interface address 10.191.0.106
    In the area 0.0.0.10 [NSSA] via interface tun1
    Neighbor priority is 1, State is Full, 7 state changes
    Most recent state change statistics:
      Progressive change 4.356s ago
      Regressive change 44.387s ago, due to 1-WayReceived
    DR is 10.191.0.105, BDR is 10.191.0.106
    Options 8 *|-|-|-|N/P|-|-|-
    Dead timer due in 36.603s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Up, Last update: 0:00:01:10

rzalamena commented 5 years ago

Hi @AdrianBan, a lot of fixes just went into master (see PR #3918). Would you mind trying it again on master?

EasyNetDev commented 5 years ago

Hi @rzalamena,

I've installed FRR 7.0 on R2 and R5. The results:

R2:

Buh-R2# show ip ospf interface tun0
tun0 is up
  ifindex 27, MTU 1480 bytes, BW 200 Mbit <UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>
  Internet Address 10.191.0.106/30, Broadcast 10.191.0.107, Area 0.0.0.10 [NSSA]
  MTU mismatch detection: enabled
  Router ID 10.190.0.2, Network Type POINTOPOINT, Cost: 500
  Transmit Delay is 1 sec, State Point-To-Point, Priority 1
  No backup designated router on this network
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 2s, Dead 10s, Wait 10s, Retransmit 3
    Hello due in 1.153s
  Neighbor Count is 1, Adjacent neighbor count is 1
  BFD: Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300

Buh-R2# show ip ospf neighbor 10.191.0.1
 Neighbor 10.191.0.1, interface address 10.191.0.105
    In the area 0.0.0.10 [NSSA] via interface tun0
    Neighbor priority is 1, State is Full, 8 state changes
    Most recent state change statistics:
      Progressive change 4m33s ago
      Regressive change 4m37s ago, due to SeqNumberMismatch
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 8 *|-|-|-|N/P|-|-|-
    Dead timer due in 9.784s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never

Buh-R2# show bfd peers 
BFD Peers:
    peer 10.191.0.105 interface tun0
        ID: 1
        Remote ID: 2
        Status: up
        Uptime: 5 minute(s), 1 second(s)
        Diagnostics: ok
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: disabled
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms

R5:

R5-pol# show ip ospf interface tun1
tun1 is up
  ifindex 26, MTU 1480 bytes, BW 0 Mbit <UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>
  Internet Address 10.191.0.105/30, Area 0.0.0.10 [NSSA]
  MTU mismatch detection: enabled
  Router ID 10.191.0.1, Network Type POINTOPOINT, Cost: 10
  Transmit Delay is 1 sec, State Point-To-Point, Priority 1
  No backup designated router on this network
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 2s, Dead 10s, Wait 10s, Retransmit 3
    Hello due in 1.277s
  Neighbor Count is 1, Adjacent neighbor count is 1
  BFD: Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300

R5-pol# show ip ospf neighbor 10.190.0.2
    Neighbor 10.190.0.2, interface address 10.191.0.106
    In the area 0.0.0.10 [NSSA] via interface tun1
    Neighbor priority is 1, State is Full, 5 state changes
    Most recent state change statistics:
      Progressive change 7m22s ago
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 8 *|-|-|-|N/P|-|-|-
    Dead timer due in 8.060s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission on
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never

R5-pol# show bfd peer 
BFD Peers:
        peer 10.191.0.106 interface tun1
        ID: 2
        Remote ID: 1
        Status: up
        Uptime: 5 minute(s), 55 second(s)
        Diagnostics: ok
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms

I will run some tests, but I'm wondering whether this part of the OSPF neighbor output is OK:

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never

Kind regards, Adrian
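The mismatch in question (ospfd reporting BFD status Unknown while bfdd reports the same peer as up) can be spotted mechanically. The sketch below assumes the show ip ospf neighbor detail layout pasted above; the function names are hypothetical, and the bfdd view is passed in as a plain dict for brevity.

```python
import re


def ospf_bfd_status(detail_text):
    """Map neighbor interface address -> BFD status as reported by ospfd."""
    statuses = {}
    current = None
    for line in detail_text.splitlines():
        m = re.search(r"Neighbor (\S+), interface address (\S+)", line)
        if m:
            current = m.group(2)
            continue
        m = re.search(r"Status: (\w+), Last update: (.+)", line)
        if m and current is not None:
            statuses[current] = m.group(1)
    return statuses


def bfd_mismatches(ospf_view, bfdd_view):
    """Addresses where ospfd and bfdd disagree about the BFD session state."""
    return {addr: (ospf_view[addr], bfdd_view[addr])
            for addr in ospf_view
            if addr in bfdd_view
            and ospf_view[addr].lower() != bfdd_view[addr].lower()}


# Trimmed from the R5 output above.
DETAIL = """\
    Neighbor 10.190.0.2, interface address 10.191.0.106
    In the area 0.0.0.10 [NSSA] via interface tun1
    Neighbor priority is 1, State is Full, 5 state changes
    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never
"""

# bfdd's view of the same peer, taken from the show bfd peer output above.
print(bfd_mismatches(ospf_bfd_status(DETAIL), {"10.191.0.106": "up"}))
```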

EasyNetDev commented 5 years ago

Hi @rzalamena,

It seems that it is still not working :( :

Before test: R5

R5-pol# show bfd peers 
BFD Peers:
    peer 10.191.0.106 interface tun1
        ID: 2
        Remote ID: 1
        Status: up
        Uptime: 7 second(s)
        Diagnostics: ok
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms
R5-pol# show ip ospf neighbor 

Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
10.190.0.1        1 Full/DROther       9.533s 10.191.0.101    tun0:10.191.0.102        0     0     0
10.190.0.2        1 Full/DROther      39.427s 10.191.0.106    tun1:10.191.0.105        0     0     0

R5-pol# show ip ospf interface tun1
tun1 is up
  ifindex 26, MTU 1480 bytes, BW 0 Mbit <UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>
  Internet Address 10.191.0.105/30, Area 0.0.0.10 [NSSA]
  MTU mismatch detection: enabled
  Router ID 10.191.0.1, Network Type POINTOPOINT, Cost: 10
  Transmit Delay is 1 sec, State Point-To-Point, Priority 1
  No backup designated router on this network
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 3.716s
  Neighbor Count is 1, Adjacent neighbor count is 1
  BFD: Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300

On R2, I added a rule that drops traffic to the tunnel's destination endpoint: iptables -A OUTPUT -d 89.X.X.208 -p 4 -j DROP

R5 after the traffic drop:

R5-pol# show bfd peers 
BFD Peers:
    peer 10.191.0.106 interface tun1
        ID: 2
        Remote ID: 0
        Status: down
        Downtime: 3 second(s)
        Diagnostics: control detection time expired
        Remote diagnostics: ok
        Local timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms
        Remote timers:
            Receive interval: 50ms
            Transmission interval: 50ms
            Echo transmission interval: 50ms
R5-pol# show ip ospf neighbor 

Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
10.190.0.1        1 Full/DROther       9.961s 10.191.0.101    tun0:10.191.0.102        0     0     0
10.190.0.2        1 Full/DROther      34.806s 10.191.0.106    tun1:10.191.0.105        0     0     0
R5-pol# show ip ospf interface tun1
tun1 is up
  ifindex 26, MTU 1480 bytes, BW 0 Mbit <UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>
  Internet Address 10.191.0.105/30, Area 0.0.0.10 [NSSA]
  MTU mismatch detection: enabled
  Router ID 10.191.0.1, Network Type POINTOPOINT, Cost: 10
  Transmit Delay is 1 sec, State Point-To-Point, Priority 1
  No backup designated router on this network
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 8.680s
  Neighbor Count is 1, Adjacent neighbor count is 1
  BFD: Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
R5-pol# sh ip route 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

O>* 0.0.0.0/0 [110/11] via 10.191.0.101, tun0, 00:00:38
  *                    via 10.191.0.106, tun1, 00:00:38

The route stays active until the OSPF dead interval expires.

Kind regards, Adrian

EasyNetDev commented 5 years ago

Hi,

Today I will run some tests on my lab platforms, and I will also test PIM with BFD. I've noticed some issues in PIM as well.

Kind regards, Adrian

EasyNetDev commented 5 years ago

Hi all,

I see random behavior. Sometimes BFD correctly signals the OSPF process on Ethernet interfaces (one test was OK); in the other tests BFD went down but OSPF did not:

Every 1.0s: iface=ens256; vtysh -c "show ip route ospf"; vtysh -c "show ip ospf int $iface"; vtysh -c "show bfd peer 10.180.0.198 int $iface"                         FRR-1: Fri Mar 29 23:10:41 2019

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

O   10.180.0.14/32 [110/10] via 0.0.0.0, lo0 onlink, 01w1d04h
O>* 10.180.0.15/32 [110/20] via 10.180.0.198, ens256, 00:01:07
O   10.180.0.196/30 [110/10] is directly connected, ens256, 01w1d04h
O>* 10.180.0.200/30 [110/20] via 10.180.0.198, ens256, 00:01:06
O>* 10.180.0.204/30 [110/110] via 10.180.0.198, ens256, 00:01:07
ens256 is up
  ifindex 5, MTU 1500 bytes, BW 10000 Mbit <UP,BROADCAST,RUNNING,MULTICAST>
  Internet Address 10.180.0.197/30, Broadcast 10.180.0.199, Area 0.0.0.0
  MTU mismatch detection: enabled
  Router ID 10.180.0.14, Network Type POINTOPOINT, Cost: 10
  Transmit Delay is 1 sec, State Point-To-Point, Priority 1
  No backup designated router on this network
  Saved Network-LSA sequence number 0x80000002
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 2.459s
  Neighbor Count is 1, Adjacent neighbor count is 1
  BFD: Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300

BFD Peer:
        peer 10.180.0.198 interface ens256
                ID: 1
                Remote ID: 2
                Status: init
                Diagnostics: neighbor signaled session down
                Remote diagnostics: control detection time expired
                Local timers:
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo transmission interval: disabled
                Remote timers:
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo transmission interval: 50ms

R1:

interface ens256
 description to_FRR-2
 ip address 10.180.0.197/30
 ip ospf bfd
 ip ospf network point-to-point
 ip pim
 ip pim bfd
 multicast
!
interface lo0
 description Router ID
 ip address 10.180.0.14/32
 ip pim
 multicast
!
interface pimreg
 ip igmp
 ip pim
!
router ospf
 ospf router-id 10.180.0.14
 redistribute connected
 network 10.180.0.14/32 area 0
 network 10.180.0.192/30 area 0
 network 10.180.0.196/30 area 0
!
bfd
 peer 10.180.0.198 interface ens256
  no shutdown
 !

R2:

interface ens224
 description to FRR-1
 ip address 10.180.0.198/30
 ip ospf bfd
 ip ospf network point-to-point
 ip pim
 ip pim bfd
 multicast
!
interface ens256
 description to_MCAST_RCV
 ip address 10.180.0.201/30
 multicast
!
interface lo0
 description Router ID
 ip address 10.180.0.15/32
 ip pim
 multicast
!
interface tun0
 bandwidth 1000
 ip address 10.180.0.205/30
 ip ospf bfd
 ip pim
 ip pim bfd
!
interface ens192
 description INTERNET ACCESS
!
router ospf
 ospf router-id 10.180.0.15
 redistribute connected
 network 10.180.0.15/32 area 0
 network 10.180.0.196/30 area 0
 network 10.180.0.204/30 area 0
!
line vty
!
 peer 10.180.0.206 interface tun0
  no shutdown
 !
 peer 10.180.0.197 interface ens224
  no shutdown
 !
 peer 10.180.0.202 interface ens256
  no shutdown
 !
!
end

On R2 I added these rules to drop the traffic:

root@FRR-2:~# iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
 9375  591K DROP       all  --  ens256 *       0.0.0.0/0            0.0.0.0/0           
 2401  128K DROP       all  --  ens224 *       0.0.0.0/0            0.0.0.0/0           

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
root@FRR-2:~# 

EasyNetDev commented 5 years ago

Hi,

On Ethernet interfaces, if the OSPF network type is BROADCAST, BFD triggers the OSPF process very quickly. So it seems that the OSPF network type POINT-TO-POINT is still not working correctly :(.

Kind regards, Adrian

EasyNetDev commented 5 years ago

Today I compiled the git version of FRR. The issue on POINT-TO-POINT interfaces still exists. BFD does not trigger the OSPF process at all to remove the invalid routes. PIM, for example, is triggered instantly by BFD.

EasyNetDev commented 5 years ago

Hi all,

Yesterday I compiled version 7.2-dev-20190625-05-g0d7b9179a of FRR. The problem with BFD on OSPF interfaces with network type POINT-TO-POINT is still present. It doesn't trigger the OSPF session on that specific interface.

Kind regards, Adrian

lucize commented 5 years ago

Hi @AdrianBan, did you try the new xfrm interface available in kernels >4.19, usable with newer strongSwan and Libreswan? I'm also interested in this, but for the moment I can only use VTI interfaces (point-to-point) and can't test this issue.

EasyNetDev commented 5 years ago

@lucize

No, I didn't try it. iproute2 does not support xfrm interfaces yet and the Debian 10 strongSwan packages do not include the xfrmi tool, so I couldn't test it.

lucize commented 5 years ago

@AdrianBan I ended up using the standalone daemon https://github.com/rzalamena/bfdd. For it to compile with musl I had to comment out #include <linux/ipv6.h> in bfd_packet.c and fix the typo in bfd.h, changing #define BFD_SRCPORTINIT 49142 to #define BFD_SRCPORTINIT 49152. With that, ospfd seems to react to changes.

ghost commented 5 years ago

@lucize using your comment I was able to compile BFD from that git link, but just replacing the current BFD in FRR with that one doesn't work. I'm a bit of a newbie, and sorry for that, but can you tell me what the next step is after compiling to get this to work with ospf?

I am having similar problems with OSPF and BFD being slow to respond to link downs.

lucize commented 5 years ago

@cabsil22 just disable the BFD config in FRR and use bfdd as in these examples (the JSON and the init script): https://github.com/openwrt/packages/tree/master/net/bfdd/files. The init script is adapted for OpenWrt, but I think you can understand how to run it.

lucize commented 5 years ago

If you need a more explicit example:

{
  "ipv4": [
    {
      "multihop": false,
      "peer-address": "192.168.200.1",
      "local-address": "192.168.200.2",
      "local-interface": "vti-vti0",
      "label": "tunnel0",
      "detect-multiplier": 3,
      "receive-interval": 250,
      "transmit-interval": 250,
      "echo-mode": false
    },
    {
      "multihop": false,
      "peer-address": "192.168.200.5",
      "local-address": "192.168.200.6",
      "local-interface": "vti-vti1",
      "label": "tunnel1",
      "detect-multiplier": 3,
      "receive-interval": 250,
      "transmit-interval": 250,
      "echo-mode": false
    },
    {
      "create-only": true,
      "multihop": false,
      "peer-address": "192.168.201.1",
      "local-address": "192.168.201.2",
      "local-interface": "vti-vti2",
      "label": "tunnel2",
      "detect-multiplier": 3,
      "receive-interval": 250,
      "transmit-interval": 250,
      "echo-mode": false
    },
    {
      "create-only": true,
      "multihop": false,
      "peer-address": "192.168.201.17",
      "local-address": "192.168.201.18",
      "local-interface": "vti-vti3",
      "label": "tunnel3",
      "detect-multiplier": 3,
      "receive-interval": 250,
      "transmit-interval": 250,
      "echo-mode": false
    }
  ]
}
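
Hand-edited JSON is easy to break (for example with a stray trailing comma, which strict JSON parsers reject). As a rough sketch, the peer list can be generated instead. The key names below are copied from the example above; the helper names `bfdd_peer` and `render_config` are made up for illustration and are not part of bfdd.

```python
import json


def bfdd_peer(peer, local, ifname, label,
              interval_ms=250, multiplier=3, create_only=False):
    """Build one single-hop IPv4 peer entry (hypothetical helper).

    Only keys that appear in the example config above are used.
    """
    entry = {
        "multihop": False,
        "peer-address": peer,
        "local-address": local,
        "local-interface": ifname,
        "label": label,
        "detect-multiplier": multiplier,
        "receive-interval": interval_ms,
        "transmit-interval": interval_ms,
        "echo-mode": False,
    }
    if create_only:
        # Keep "create-only" first, as in the example config.
        entry = {"create-only": True, **entry}
    return entry


def render_config(peers):
    """Serialize the peer list; json.dumps never emits trailing commas."""
    return json.dumps({"ipv4": peers}, indent=2)


if __name__ == "__main__":
    peers = [
        bfdd_peer("192.168.200.1", "192.168.200.2", "vti-vti0", "tunnel0"),
        bfdd_peer("192.168.201.1", "192.168.201.2", "vti-vti2", "tunnel2",
                  create_only=True),
    ]
    print(render_config(peers))
```

Redirecting the output to a file gives a config that any strict JSON parser will accept, regardless of how lenient the daemon's own parser is.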

EasyNetDev commented 5 years ago

Hi all,

This weekend I rebuilt the latest FRR. I've noticed some improvements on OSPF + P2P + BFD, but it is still not working 100%. Example:

R01:

interface tun11
 bandwidth 200
 description IPIP;R10;VPN;
 ip address 10.193.1.101/30
 ip ospf bfd
 ip ospf cost 500
 ip pim
 ip pim bfd
 multicast
 link-params
  enable
  packet-loss 0.01
  exit-link-params

R10:

interface tun0
 bandwidth 200
 description IPIP R10-R01 interco
 ip address 10.193.1.102/30
 ip ospf bfd
 ip ospf cost 500
 ip pim
 ip pim bfd
 multicast
 link-params
  enable
  packet-loss 0.01
  exit-link-params

Dropping the tunnel connectivity on R10:

iptables -A INPUT -s R01_SOURCE_ip -p 4 -j DROP

Checking the status of OSPF neighbor on R10:

R10# show ip ospf neighbor 10.190.0.1

R10# 

So no info. Bringing the connectivity back up on R10 (deleting the DROP rule):

iptables -D INPUT -s R01_SOURCE_ip -p 4 -j DROP

Checking the OSPF neighbor:

R10# show ip ospf neighbor 10.190.0.1
 Neighbor 10.190.0.1, interface address 10.193.1.101
    In the area 0.0.0.10 via interface tun0
    Neighbor priority is 1, State is Full, 4 state changes
    Most recent state change statistics:
      Progressive change 0.259s ago
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 2 *|-|-|-|-|-|E|-
    Dead timer due in 39.743s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 1
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Unknown, Last update: never

The BFD status is Unknown. When I drop the tunnel traffic again, OSPF suddenly sees the neighbor as DOWN:

R10# show ip ospf neighbor 10.190.0.1
 Neighbor 10.190.0.1, interface address 10.193.1.101
    In the area 0.0.0.10 via interface tun0
    Neighbor priority is 1, State is Full, 4 state changes
    Most recent state change statistics:
      Progressive change 47.675s ago
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 2 *|-|-|-|-|-|E|-
    Dead timer due in 32.102s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Down, Last update: 0:00:00:02

After bringing the traffic back up, OSPF sees the neighbor as UP!

R10# show ip ospf neighbor 10.190.0.1
 Neighbor 10.190.0.1, interface address 10.193.1.101
    In the area 0.0.0.10 via interface tun0
    Neighbor priority is 1, State is Full, 4 state changes
    Most recent state change statistics:
      Progressive change 14.513s ago
    DR is 0.0.0.0, BDR is 0.0.0.0
    Options 2 *|-|-|-|-|-|E|-
    Dead timer due in 26.261s
    Database Summary List 0
    Link State Request List 0
    Link State Retransmission List 0
    Thread Inactivity Timer on
    Thread Database Description Retransmision off
    Thread Link State Request Retransmission off
    Thread Link State Update Retransmission on

    BFD: Type: single hop
      Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
      Status: Up, Last update: 0:00:00:02
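
The mismatch described here (OSPF state Full while the BFD status is Unknown or Down) can be spotted automatically. The following is only a sketch that parses text in the shape of the `show ip ospf neighbor` dumps above; the function names are hypothetical and the regular expressions assume exactly this output layout, which may differ between FRR versions.

```python
import re

# Patterns are derived from the output pasted above, not from FRR sources.
NEIGHBOR_RE = re.compile(r"^\s*Neighbor (\S+), interface address (\S+)")
STATE_RE = re.compile(r"State is ([\w-]+)")
BFD_STATUS_RE = re.compile(r"^\s*Status: (\w+), Last update: (\S+)")


def parse_neighbors(text):
    """Return one dict per neighbor with its OSPF state and BFD status."""
    neighbors = []
    current = None
    for line in text.splitlines():
        m = NEIGHBOR_RE.match(line)
        if m:
            current = {"id": m.group(1), "address": m.group(2),
                       "state": None, "bfd": None}
            neighbors.append(current)
            continue
        if current is None:
            continue
        m = STATE_RE.search(line)
        if m:
            current["state"] = m.group(1)
            continue
        m = BFD_STATUS_RE.match(line)
        if m:
            current["bfd"] = m.group(1)
    return neighbors


def suspicious(neighbors):
    """Neighbors that are Full in OSPF but whose BFD status is not Up.

    Neighbors without any BFD block (bfd is None) are not flagged.
    """
    return [n for n in neighbors
            if n["state"] == "Full" and n["bfd"] not in (None, "Up")]
```

Feeding it the output of `vtysh -c "show ip ospf neighbor 10.190.0.1"` at each step of the test makes it easy to see when the adjacency and the BFD session disagree.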

It now reacts instantly when I drop the traffic. But if the hello time expires, OSPF deletes the neighbor from the list, and when the neighbor comes back its BFD status is Unknown.

So part of the issue has been solved, but not the part about putting the neighbor in the correct BFD state when the adjacency is set up.

Kind regards, Adrian

ghost commented 5 years ago

@lucize even with the BFDD daemon from https://github.com/rzalamena/bfdd, I am having trouble with OSPF not getting the updated peer status. I can see the BFDD daemon flag the peer as down, but OSPF does not react. Is there a step that I missed?

My routing config is simple:

Current configuration:
!
frr version 7.1
frr defaults traditional
log syslog informational
no ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
ip route 8.8.4.4/32 eth0
!
interface gre1
 ip ospf bfd
 ip ospf cost 15
!
interface gre3
 ip ospf bfd
 ip ospf cost 20
!
router ospf
 ospf abr-type standard
 network 10.99.80.1/32 area 0.0.0.0
 network 10.99.94.0/30 area 0.0.0.0
 network 10.99.98.0/30 area 0.0.0.0
 area 0.0.0.0 range 10.99.94.0/30
 area 0.0.0.0 range 10.99.98.0/30
 capability opaque
!
line vty
!
end

And a simple BFDD config for the standalone daemon.

Any help you can provide to get a working OSPF/BFD set up until the FRR tree is working would be greatly appreciated.

lucize commented 5 years ago

Can you activate log-adjacency-changes detail under router ospf? You should see some OSPF activity even when you restart the bfdd daemon.

For me, when BFD can't exchange packets in time, it looks like this:

Tue Oct 22 20:40:48 2019 daemon.info bfdd[4818]: Session 0x2 down peer 192.168.201.13 Rsn DetectTime prev st Up
Tue Oct 22 20:40:48 2019 daemon.info bfdd[4818]: bfd_recvtimer_cb Detect timeout on session 0x2 with peer 192.168.201.13, in state 1
Tue Oct 22 20:40:48 2019 daemon.info bfdd[4818]: BFD Sess 2 [192.168.201.13] Old State [Up] : New State [Down]
Tue Oct 22 20:40:49 2019 daemon.notice ospfd[23781]: AdjChg: Nbr 172.27.28.1 on vti-vti3:192.168.201.14: Init -> ExStart (2-WayReceived)
Tue Oct 22 20:40:49 2019 daemon.info ospfd[23781]: Packet[DD]: Neighbor 172.27.28.1: Initial DBD from Slave, ignoring.
Tue Oct 22 20:40:49 2019 daemon.info ospfd[23781]: Packet[DD]: Neighbor 172.27.28.1 Negotiation done (Master).
Tue Oct 22 20:40:49 2019 daemon.notice ospfd[23781]: AdjChg: Nbr 172.27.28.1 on vti-vti3:192.168.201.14: ExStart -> Exchange (NegotiationDone)
Tue Oct 22 20:40:49 2019 daemon.notice ospfd[23781]: AdjChg: Nbr 172.27.28.1 on vti-vti3:192.168.201.14: Exchange -> Full (ExchangeDone)
Tue Oct 22 20:40:49 2019 daemon.info ospfd[23781]: nsm_change_state:(172.27.28.1, Exchange -> Full): scheduling new router-LSA origination
Tue Oct 22 20:40:49 2019 daemon.info bfdd[4818]: BFD Sess 2 [192.168.201.13] Old State [Down] : New State [Init]
Tue Oct 22 20:40:49 2019 daemon.info bfdd[4818]: Session 0x2 up peer 192.168.201.13
Tue Oct 22 20:40:49 2019 daemon.info bfdd[4818]: BFD Sess 2 [192.168.201.13] Old State [Init] : New State [Up]

rzalamena commented 5 years ago

Have you guys tried this new PR #5258 ? If not, please try it and let us know.

lucize commented 5 years ago

@rzalamena it seems to work. I tested against a FortiGate, enabling and disabling BFD on the FortiGate IPsec OSPF interface, and every action on the FortiGate triggered OSPF.

Thu Oct 31 19:26:19 2019 daemon.info bfdd[3099]: session-new: mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0
Thu Oct 31 19:31:58 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Full -> Init (1-WayReceived)
Thu Oct 31 19:31:58 2019 daemon.info ospfd[2764]: nsm_change_state:(192.168.4.10, Full -> Init): scheduling new router-LSA origination
Thu Oct 31 19:31:58 2019 daemon.info bfdd[3099]: session-delete: mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0
Thu Oct 31 19:31:58 2019 daemon.info bfdd[3099]: state-change: [mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0] admin-down -> down reason:path-down
Thu Oct 31 19:32:06 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Init -> ExStart (2-WayReceived)
Thu Oct 31 19:32:06 2019 daemon.info ospfd[2764]: Packet[DD]: Neighbor 192.168.4.10: Initial DBD from Slave, ignoring.
Thu Oct 31 19:32:06 2019 daemon.info bfdd[3099]: session-new: mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0
Thu Oct 31 19:32:06 2019 daemon.info ospfd[2764]: Packet[DD]: Neighbor 192.168.4.10 Negotiation done (Master).
Thu Oct 31 19:32:06 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: ExStart -> Exchange (NegotiationDone)
Thu Oct 31 19:32:06 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Exchange -> Full (ExchangeDone)
Thu Oct 31 19:32:06 2019 daemon.info ospfd[2764]: nsm_change_state:(192.168.4.10, Exchange -> Full): scheduling new router-LSA origination
Thu Oct 31 19:33:43 2019 daemon.debug bfdd[3099]:  peer 192.168.202.1 found, but loc-addr 192.168.202.2 ignored
Thu Oct 31 19:33:44 2019 daemon.info bfdd[3099]: state-change: [mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0] init -> up
Thu Oct 31 19:42:03 2019 daemon.err bfdd[3099]: libyang: Internal error (libyang-0.16-r3/src/plugins.c:641).
Thu Oct 31 19:42:06 2019 daemon.err bfdd[3099]: libyang: Internal error (libyang-0.16-r3/src/plugins.c:641).
Thu Oct 31 19:42:11 2019 daemon.err bfdd[3099]: libyang: Internal error (libyang-0.16-r3/src/plugins.c:641).
Thu Oct 31 19:44:44 2019 daemon.info bfdd[3099]: state-change: [mhop:no peer:192.168.202.1 local:0.0.0.0 vrf:default ifname:vti-vti0] up -> down reason:control-expired
Thu Oct 31 19:44:44 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Full -> Deleted (InactivityTimer)
Thu Oct 31 19:44:44 2019 daemon.info ospfd[2764]: nsm_change_state:(192.168.4.10, Full -> Deleted): scheduling new router-LSA origination
Thu Oct 31 19:44:50 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Down -> Init (PacketReceived)
Thu Oct 31 19:44:50 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Init -> ExStart (2-WayReceived)
Thu Oct 31 19:44:56 2019 daemon.info ospfd[2764]: Packet[DD]: Neighbor 192.168.4.10: Initial DBD from Slave, ignoring.
Thu Oct 31 19:45:00 2019 daemon.info ospfd[2764]: Packet[DD]: Neighbor 192.168.4.10 Negotiation done (Master).
Thu Oct 31 19:45:00 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: ExStart -> Exchange (NegotiationDone)
Thu Oct 31 19:45:00 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Exchange -> Loading (ExchangeDone)
Thu Oct 31 19:45:00 2019 daemon.notice ospfd[2764]: AdjChg: Nbr 192.168.4.10 on vti-vti0:192.168.202.2: Loading -> Full (LoadingDone)
Thu Oct 31 19:45:00 2019 daemon.info ospfd[2764]: nsm_change_state:(192.168.4.10, Loading -> Full): scheduling new router-LSA origination
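
To check that ospfd really follows bfdd, the delay between a BFD "up -> down" and the next OSPF adjacency change out of Full on the same interface can be pulled out of a log like the one above. This is a sketch only: the regular expressions are written against the exact line format shown in this log, the function name `reaction_delays` is made up, and syslog timestamps have one-second resolution, so a delay of 0.0 just means "within the same second".

```python
import re
from datetime import datetime

# "Thu Oct 31 19:44:44 2019 daemon.info bfdd[3099]: message"
LINE_RE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) \S+ (\w+)\[\d+\]: (.*)$")
BFD_DOWN_RE = re.compile(r"state-change: \[.*ifname:(\S+)\] up -> down")
OSPF_LEAVE_FULL_RE = re.compile(
    r"AdjChg: Nbr (\S+) on ([^:]+):\S+: Full -> (\w+)")


def reaction_delays(lines):
    """Return (ifname, neighbor, new_state, delay_seconds) tuples."""
    pending = {}  # ifname -> time of the last BFD up -> down
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like the log above
        # %a/%b parsing assumes English day and month names.
        when = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
        daemon, msg = m.group(2), m.group(3)
        if daemon == "bfdd":
            b = BFD_DOWN_RE.search(msg)
            if b:
                pending[b.group(1)] = when
        elif daemon == "ospfd":
            o = OSPF_LEAVE_FULL_RE.search(msg)
            if o and o.group(2) in pending:
                delay = (when - pending.pop(o.group(2))).total_seconds()
                results.append((o.group(2), o.group(1), o.group(3), delay))
    return results
```

An OSPF change that is not preceded by a BFD down event on the same interface is ignored, so adjacency drops caused by the peer itself do not show up in the result.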

the libyang error is related to a patch in openwrt

I don't know if this is a problem: if I remember correctly, setting an interval in FRR would also adjust the remote timers, but now I see that they are different.

sh bfd peers
BFD Peers:
        peer 192.168.202.1 vrf default interface vti-vti0
                ID: 1096416105
                Remote ID: 2
                Status: up
                Uptime: 8 minute(s), 44 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Local timers:
                        Receive interval: 600ms
                        Transmission interval: 600ms
                        Echo transmission interval: 50ms
                Remote timers:
                        Receive interval: 250ms
                        Transmission interval: 250ms
                        Echo transmission interval: 0ms
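
For what it's worth, differing local and remote timers are not necessarily a fault. In asynchronous mode (RFC 5880) each side advertises its own values, and the effective transmit interval and detection time are derived per direction from both sets. A small sketch of that arithmetic follows; the function names are illustrative, and the detect multiplier of 3 is an assumption, since it is not shown in the output above.

```python
def negotiated_tx_ms(local_desired_min_tx, remote_required_min_rx):
    """Interval at which the local system actually sends control packets.

    RFC 5880: the larger of the local Desired Min TX Interval and the
    remote Required Min RX Interval (jitter is ignored in this sketch).
    """
    return max(local_desired_min_tx, remote_required_min_rx)


def detection_time_ms(remote_detect_mult, local_required_min_rx,
                      remote_desired_min_tx):
    """Time without packets after which the local system declares Down.

    RFC 5880 (asynchronous mode): the remote Detect Mult times the larger
    of the local Required Min RX and the remote Desired Min TX.
    """
    return remote_detect_mult * max(local_required_min_rx,
                                    remote_desired_min_tx)


if __name__ == "__main__":
    # Values from the output above; the multiplier 3 is assumed.
    local_rx, local_tx = 600, 600
    remote_rx, remote_tx = 250, 250
    print("local tx interval:", negotiated_tx_ms(local_tx, remote_rx), "ms")
    print("remote tx interval:", negotiated_tx_ms(remote_tx, local_rx), "ms")
    print("local detection time:",
          detection_time_ms(3, local_rx, remote_tx), "ms")
```

With these numbers both directions end up sending every 600 ms and the local detection time is 1800 ms, even though the remote side is configured for 250 ms.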

EasyNetDev commented 5 years ago

Hi,

Thanks for the patch! I will give it a try as well.

Kind regards, Adrian

rzalamena commented 5 years ago

The code to fix this issue was merged into master and stable/7.2. If there are still issues or notes, please let us know.

@polychaeta autoclose in 1 week