Closed lukastribus closed 8 years ago
Hi Lukas and thank you for your interest to BLT. Regarding this issue, if you configure your BGP router with keepalive/holdtime 90/180, as soon as it speaks with BLT a negociation occurs, and you end up with a 10/30 working peer.
Real life example :
RP/0/0/CPU0:iosxrv-1#sho bgp neighbor
Thu May 12 16:24:32.593 UTC
BGP neighbor is 192.168.255.2 ==> this is BLT
Remote AS 1, local AS 1, internal link
Remote router ID 0.0.0.1
BGP state = Established, up for 01:13:35
Last read 00:00:05, Last read before reset 01:14:10
Hold time is 30, keepalive interval is 10 seconds
Configured hold time: 180, keepalive: 90, min acceptable hold time: 3
Could you elaborate the situation ending up to a dead peer ? I think I don't get it...
Hi Alain,
thank you for this project, I hope you don't mind me submitting a bunch of issues :)
BGP keeps going down to my XR box, with default BGP timers - or 90/180 as you proposed, because BLT doesn't send the second keepalive in time:
RP/0/RSP0/CPU0:PE-9#show run router bgp 65535 neighbor 10.7.250.6
Fri May 13 11:44:33.798 CEST
router bgp 65535
neighbor 10.7.250.6
remote-as 65535
timers 90 180
update-source Loopback0
session-open-mode passive-only
address-family link-state link-state
!
!
!
RP/0/RSP0/CPU0:PE-9#show bgp neighbor 10.7.250.6 | inc "old t"
Fri May 13 11:44:45.055 CEST
Hold time is 30, keepalive interval is 10 seconds
Configured hold time: 180, keepalive: 90, min acceptable hold time: 3
Last reset 00:00:46, due to BGP Notification sent: hold time expired
Error Code: hold time expired
RP/0/RSP0/CPU0:PE-9#
RP/0/RSP0/CPU0:PE-9#debug bgp keepalive 10.7.250.6 in
Fri May 13 11:46:07.030 CEST
RP/0/RSP0/CPU0:PE-9#
RP/0/RSP0/CPU0:May 13 11:46:21.176 : bgp[1058]: [default-iord]: KEEPALIVE received from 10.7.250.6
RP/0/RSP0/CPU0:May 13 11:46:21.176 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Up (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:46:51.177 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Down - BGP Notification sent, hold time expired (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:47:44.409 : bgp[1058]: [default-iord]: KEEPALIVE received from 10.7.250.6
RP/0/RSP0/CPU0:May 13 11:47:44.409 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Up (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:48:14.410 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Down - BGP Notification sent, hold time expired (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:48:45.249 : bgp[1058]: [default-iord]: KEEPALIVE received from 10.7.250.6
RP/0/RSP0/CPU0:May 13 11:48:45.249 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Up (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:49:01.258 : bgp[1058]: [default-iord]: KEEPALIVE received from 10.7.250.6
RP/0/RSP0/CPU0:May 13 11:49:31.259 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Down - BGP Notification sent, hold time expired (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:50:05.152 : bgp[1058]: [default-iord]: KEEPALIVE received from 10.7.250.6
RP/0/RSP0/CPU0:May 13 11:50:05.152 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Up (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:May 13 11:50:35.153 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Down - BGP Notification sent, hold time expired (VRF: default) (AS: 65535)
Some notes: BGP holdtime is negotiated, BGP keepalive interval is NOT. Every BGP implementation has to adjust BGP keepalive interval to be less than the holdtime and the RFC suggestion is for it to be a third of the holdtime.
The BLT behavior is that the keepalive interval is bigger than the holdtime (in this case the second keepalive is not transmitted), which is why the BGP session keeps dropping.
I can confirm this by tcpdumping on the VM where BLT runs.
Not sure about the difference between your setup and mine.
Strange because actual hold time is the minimum between HoldTime proposed by both peers.
/**
* get the negotiated hold time. This is the minimum of the locally configured hold time and the
* hold time value received from the remote peer in the initial open packet.
* It is assured that the negotiated hold time cannot be less than 3 seconds as specified by RFC4271.
*
* @return
*/
int getNegotiatedHoldTime() {
int negotiatedHoldTime = Math.min(peerConfiguration.getHoldTime(), peerProposedHoldTime);
if(negotiatedHoldTime < 3)
negotiatedHoldTime = 3;
return negotiatedHoldTime;
}
And keepalive becomes 1/3 of the negotiated HoldTime, whic complies to the RFC :
/**
* get the keepalive interval which is 1/3 of the negotiated hold time. It is assured that the interval
* cannot be less than 1 second as specified by RFC4271
*
* @return
*/
private int getSendKeepaliveTime() {
return Math.max(getNegotiatedHoldTime() / 3, 1);
}
Those excerpts come from BLT code, which was borrowed from an implementation written by Rainer Bienek. I work with an XRrouter just like you, and you can see in my previous post that, even though I have a local config at 90/180, I end up with 10/30, and BLT does send a keepalive every ten seconds :
22:26:50,476 INFO [QuartzScheduler_Worker-2] BGPv4Codec: writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@60676411[type=4]
2016-05-13 22:26:50,508 INFO [New I/O worker #1] BGPv4Codec: received packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@b0c846a[type=4]
22:27:00,478 INFO [QuartzScheduler_Worker-5] BGPv4Codec: writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@526febe8[type=4]
2016-05-13 22:27:00,528 INFO [New I/O worker #1] BGPv4Codec: received packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@a3a83ae[type=4]
22:27:10,481 INFO [QuartzScheduler_Worker-9] BGPv4Codec: writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@1e60c053[type=4]
2016-05-13 22:27:10,546 INFO [New I/O worker #1] BGPv4Codec: received packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@d00a20d[type=4]
and session remains very stable :
RP/0/0/CPU0:xrv1#sho bgp link-state link-state summary
Fri May 13 21:35:37.960 UTC
BGP router identifier 10.0.0.29, local AS number 1
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0x0 RD version: 54
BGP main routing table version 54
BGP NSR Initial initsync version 54 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
BGP is operating in STANDALONE mode.
Process RcvTblVer bRIB/RIB LabelVer ImportVer SendTblVer StandbyVer
Speaker 54 54 54 54 54 0
Neighbor Spk AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down St/PfxRcd
192.168.2.126 0 1 68 89 54 0 0 00:11:08 0
RP/0/0/CPU0:xrv1#
Haven't you changed something in the code ? or, more probably, haven't you got errors causing your peer to go up and down ? You can follow this by increasing the level of verbosity in 'blt.conf', turn it to 'INFO' might be sufficient.
Let me know.
This is vanilla blt-0.4.4 release, I did not change anything in the code.
From the logfile I can clearly see that blt does not send keepalive messages in time, for example, in the following case with 10/30 (10 seconds keepalive interval, 30 seconds holdtime), blt sends the first keepalive correctly at 12:15:57, the second keepalive very late at 12:16:24 (its 27 seconds after the first keepalive) and the third keepalive is not within 30 seconds, which is the reason why the ASR9k tears down the session:
RP/0/RSP0/CPU0:May 17 12:15:57.254 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Up (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:BZ-PE-9#
RP/0/RSP0/CPU0:BZ-PE-9#show bgp neighbor 10.7.250.6
Tue May 17 12:16:09.707 CEST
BGP neighbor is 10.7.250.6
Remote AS 65535, local AS 65535, internal link
Remote router ID 0.0.0.1
BGP state = Established, up for 00:00:12
NSR State: None
TCP open mode: passive only
Last read 00:00:12, Last read before reset 3d19h
Hold time is 30, keepalive interval is 10 seconds
Configured hold time: 180, keepalive: 60, min acceptable hold time: 3
Last write 00:00:07, attempted 7113, written 7113
Second last write 00:00:07, attempted 15765, written 8652
Last write before reset 3d19h, attempted 19, written 19
Second last write before reset 3d19h, attempted 19, written 19
Last write pulse rcvd May 17 12:16:02.314 last full May 17 12:16:02.271 pulse count 25276
Last write pulse rcvd before reset 3d19h
Socket not armed for io, armed for read, armed for write
Last write thread event before reset 3d19h, second last 3d19h
Last KA expiry before reset 3d19h, second last 3d19h
Last KA error before reset 00:00:00, KA not sent 00:00:00
Last KA start before reset 3d19h, second last 3d19h
Precedence: internet
Non-stop routing is enabled
Multi-protocol capability received
Neighbor capabilities:
Route refresh: advertised (old + new)
4-byte AS: advertised
Address family Link-state Link-state: advertised and received
Received 4275 messages, 0 notifications, 0 in queue
Sent 301547 messages, 1245 notifications, 0 in queue
Minimum time between advertisement runs is 0 secs
Inbound message logging enabled, 3 messages buffered
Outbound message logging enabled, 3 messages buffered
For Address Family: Link-state Link-state
BGP neighbor version 516
Update group: 0.1 Filter-group: 0.1 No Refresh request being processed
Route refresh request: received 0, sent 0
0 accepted prefixes, 0 are bestpaths
Cumulative no. of prefixes denied: 0.
Prefix advertised 458, suppressed 0, withdrawn 0
Maximum prefixes allowed 131072
Threshold for warning message 75%, restart interval 0 min
AIGP is enabled
An EoR was not received during read-only mode
Last ack version 0, Last synced ack version 0
Outstanding version objects: current 1, max 2
Additional-paths operation: None
Send Multicast Attributes
Connections established 1255; dropped 1254
Local host: 10.7.1.28, Local port: 179, IF Handle: 0x00000000
Foreign host: 10.7.250.6, Foreign port: 41323
Last reset 3d19h, due to Peer closing down the session
Peer reset reason: Remote closed the session (Connection timed out)
Time since last notification sent to neighbor: 3d19h
Error Code: hold time expired
Notification data sent:
None
RP/0/RSP0/CPU0:BZ-PE-9#
RP/0/RSP0/CPU0:May 17 12:16:55.174 : bgp[1058]: %ROUTING-BGP-5-ADJCHANGE : neighbor 10.7.250.6 Down - BGP Notification sent, hold time expired (VRF: default) (AS: 65535)
RP/0/RSP0/CPU0:BZ-PE-9#
root@igpmon01:/var/log# grep -E "writing.+KeepalivePacket" blt.log
12:15:57.242 [New I/O worker #1] INFO o.n.b.bgp.netty.handlers.BGPv4Codec - writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@7a7d8351[type=4]
2016-05-17 12:15:57,242 INFO [New I/O worker #1] BGPv4Codec: writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@7a7d8351[type=4]
12:16:24.698 [QuartzScheduler_Worker-8] INFO o.n.b.bgp.netty.handlers.BGPv4Codec - writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@2e10ecd3[type=4]
2016-05-17 12:16:24,698 INFO [QuartzScheduler_Worker-8] BGPv4Codec: writing packet onl.netfishers.blt.bgp.netty.protocol.KeepalivePacket@2e10ecd3[type=4]
root@igpmon01:/var/log#
I think blt is busy doing some other things and the keepalive handling is blocked or delayed for that reason. In the log I see a lot of activity like this:
12:16:06.054 [QuartzScheduler_Worker-5] INFO o.n.blt.snmp.SnmpPollingTask - Task 'SNMP Polling task for router Router BLABLA-IPE-2 (router-id 0100.0700.8011), creation time 1463480164979.' cancelled, because router Router BLABLA-IPE-2 (router-id 0100.0700.8011) is lost.
2016-05-17 12:16:06,054 INFO [QuartzScheduler_Worker-5] SnmpPollingTask: Task 'SNMP Polling task for router Router BLABLA-IPE-2 (router-id 0100.0700.8011), creation time 1463480164979.' cancelled, because router Router BLABLA-IPE-2 (router-id 0100.0700.8011) is lost.
[...]
2016-05-17 12:16:06,155 INFO [New I/O worker #1] Task: Scheduling the task '' in 1000ms.
12:16:06.155 [New I/O worker #1] INFO onl.netfishers.blt.tasks.Task - Scheduling the task '' in 1000ms.
2016-05-17 12:16:06,155 INFO [New I/O worker #1] Task: Scheduling the task '' in 1000ms.
12:16:06.155 [New I/O worker #1] INFO onl.netfishers.blt.tasks.Task - Scheduling the task '' in 1000ms.
2016-05-17 12:16:06,155 INFO [New I/O worker #1] Task: Scheduling the task '' in 1000ms.
12:16:06.155 [New I/O worker #1] INFO onl.netfishers.blt.tasks.Task - Scheduling the task '' in 1000ms.
2016-05-17 12:16:06,155 INFO [New I/O worker #1] Task: Scheduling the task '' in 1000ms.
[...]
12:16:08.761 [QuartzScheduler_Worker-4] INFO o.n.blt.snmp.SnmpPollingTask - Task 'SNMP Polling task for router Router BLABLA-P-1 (router-id 0100.0700.2001), creation time 1463480165384.' cancelled because a similar task has run meanwhile.
2016-05-17 12:16:08,761 INFO [QuartzScheduler_Worker-4] SnmpPollingTask: Task 'SNMP Polling task for router Router BLABLA-P-1 (router-id 0100.0700.2001), creation time 1463480165384.' cancelled because a similar task has run meanwhile.
I would like to send you the full, non-redacted log file privately, do you mind if I send it to your work address (I assume amoretti at the company at 170 West Tasman Drive)?
This is a production network with about 50 nodes and an average number of links such a network has. I assume you have similar or bigger deployments with blt?
Hi Lukas, thank you for the investigation. We do have deployments with huge amount of messages (nodes+links+prefixes), and 50 nodes should work.
Yes you can send me a private msg. A complete logfile at INFO level could help us to tackle this issue.
One more thing : It seems you have some routers in 'lost' state. Could you please clean your setup and try again:
Close this issue since the unexpected behaviour is due to heavy SNMP polling. I am opening another issue to follow this problem.
BGP Hold Time in BLT is 30 seconds, but keepalive message are only sent every 60 seconds. This causes the BGP session to keep going down.
As per RFC4271 section 4.4, the keepalive interval should be one third of the Hold Time interval.
If the keepalive interval is bigger than hold time, the BGP session cannot possibly be stable.
It would probably best to bump the hold time to 180 seconds.