Open akosiaris opened 3 years ago
@akosiaris sorry for the delay on this! I think your request is 100% reasonable.
Like I said on the PR, we're moving away from env var config for driving this stuff. It's not really BGP configuration, but the BGPConfiguration or BGPPeer API seems like the closest fit for this sort of config. Alternatively maybe this goes on the Node API?
Also CC @neiljerram who I believe has looked at this before.
I'm happy to work with you to get that API change sorted out. It's a bit more work than the env var PR that you originally submitted. Once we agree on the right spec, it would be a change to the API structs, and then to confd to plumb the toggle through into the BIRD configuration.
Thanks @akosiaris and @caseydavenport for CCing me.
FWIW, in Calico Enterprise we unconditionally added this at the top level:
protocol bfd {
}
and then a BGPPeer field to control adding `bfd on;` into the config for each peering:
// Specifies whether and how to detect loss of connectivity on the peerings generated by
// this BGPPeer resource. Default value "None" means nothing beyond BGP's own (slow) hold
// timer. "BFDIfDirectlyConnected" means to use BFD when the peer is directly connected.
FailureDetectionMode FailureDetectionMode
But that misses two things that this PR proposes:
It would be nice if any support added by this PR is a compatible extension of what we already have in Enterprise.
> is a compatible extension
To be clear, I mean in terms of the data model. (I'm sure we will be able to do any fix-ups that are needed to the related coding.)
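As a rough illustration of the data model described above (not the actual confd template or Enterprise code), the sketch below emits a top-level `protocol bfd` block unconditionally and adds `bfd on;` only to peerings whose mode is "BFDIfDirectlyConnected". The `Peer` class, its field names, the peer names and the addresses are hypothetical; only the mode strings, `protocol bfd`, `bfd on;` and `bgp_template` come from this thread. It does not check whether the peer is actually directly connected.

```python
from dataclasses import dataclass

# Mode names taken from the FailureDetectionMode comment quoted in this thread.
MODE_NONE = "None"
MODE_BFD_DIRECT = "BFDIfDirectlyConnected"


@dataclass
class Peer:
    """Hypothetical stand-in for one peering generated from a BGPPeer resource."""
    name: str
    ip: str
    as_number: int
    failure_detection_mode: str = MODE_NONE


def render_peer(peer: Peer) -> str:
    """Render one BIRD bgp protocol stanza, adding 'bfd on;' only when requested."""
    lines = [
        f"protocol bgp {peer.name} from bgp_template {{",
        f"  neighbor {peer.ip} as {peer.as_number};",
    ]
    if peer.failure_detection_mode == MODE_BFD_DIRECT:
        lines.append("  bfd on;")
    lines.append("}")
    return "\n".join(lines)


def render_config(peers) -> str:
    """Emit the unconditional top-level BFD protocol, then every peering."""
    blocks = ["protocol bfd {\n}"]
    blocks.extend(render_peer(p) for p in peers)
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_config([
        Peer("core1", "192.0.2.1", 64512, MODE_BFD_DIRECT),
        Peer("core2", "192.0.2.2", 64512),
    ]))
```

With this shape, a peering left at the default mode renders exactly as it does today, which is what makes the field a backwards-compatible addition to the data model.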
@caseydavenport Any plans to have this in the open source version as well?
Hi!
At the Wikimedia Foundation, the non-profit that operates Wikimedia, we've been happy users of Calico since circa 2017. It's powering our Kubernetes clusters quite nicely, so nicely that until now we have never had to ask for anything. Well, here's a first request (which we are aiming to contribute code for).
Our setup isn't particularly complex; in fact, the network part of it is described in detail at https://wikitech.wikimedia.org/wiki/Network_design (yes, we are transparent about how everything is set up), but the TL;DR is that we have 2 core routers and we BGP peer our Kubernetes nodes directly with them and only with them. So we don't do ToR BGP peering, as our access switches (asw for short) don't have L3 functionality, nor do we do full or even partial mesh with the other nodes (kinda suboptimal, we know, it's a project for the future).
We've recently started experimenting with announcing Kubernetes service IPs and it works pretty nicely (including ECMP), so many thanks for that. We did, however, run into the following issue while simulating some failure modes.
When simulating the sudden total failure of a node (e.g. due to PSU issues, total motherboard failure, etc.) we noticed that it takes up to 4 minutes for BGP to converge and for the core routers to pull the failed route from their FIB, effectively blackholing traffic for the advertised addresses (all of it or part of it, depending on whether ECMP is configured). This is kind of expected (to our understanding), as both sides have graceful restart configured (which is of course very much wanted) and the core routers use the GR timer to decide when to retract the route from the FIB.
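For a rough sense of scale, here is a small sketch comparing the 4 minutes we observed with a BFD detection time. It uses the asynchronous-mode rule from RFC 5880 (the remote detect multiplier times the larger of the local required min RX interval and the remote desired min TX interval). The 300 ms intervals and the multiplier of 3 are assumed values for illustration only, not our routers' settings or BIRD defaults.

```python
def bfd_detection_time_ms(local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int,
                          remote_detect_mult: int) -> int:
    """Detection time in BFD asynchronous mode, per RFC 5880 section 6.8.4."""
    negotiated_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * negotiated_interval_ms


# Observed in our failure test: up to 4 minutes before the route left the FIB.
OBSERVED_BLACKHOLE_MS = 4 * 60 * 1000

# Assumed, illustrative BFD timers (not measured values).
detection_ms = bfd_detection_time_ms(300, 300, 3)

if __name__ == "__main__":
    print(f"BFD detection time: {detection_ms} ms")
    print(f"Observed blackhole window: {OBSERVED_BLACKHOLE_MS} ms")
    print(f"Ratio: {OBSERVED_BLACKHOLE_MS / detection_ms:.0f}x")
```

Under those assumed timers the failure would be detected in under a second, a couple of orders of magnitude faster than waiting for the GR timer.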
After discussing this internally we reached the conclusion that BFD would be the better solution to our problem (mind you, we already have other servers that BGP peer using bird with BFD configured). I am spelling out our plan below in the rubric. We see that since projectcalico/bird#68 BFD should be supported.
Expected Behavior
When a kubernetes node using calico and BGP peering with an upstream router suddenly fails totally, BGP promptly converges and traffic is blackholed for a minimal amount of time.
Current Behavior
When a kubernetes node using calico and BGP peering with an upstream router suddenly fails totally, traffic is blackholed for up to 4 minutes in our case, although this depends on the BGP configuration.
Possible Solution
Add something like the following (untested) in bird.cfg.template and bird6.cfg.template
and similarly guard a `bfd yes;` statement in the bgp_template with an if.
I'd be more than happy to submit a PR to the confd repo if this is a good way to go about this, but also happy to discuss alternatives.
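To illustrate the shape of the guard (this is not the template snippet itself, just a sketch of the logic in Python), the toggle could decide whether both fragments are emitted or neither. The environment variable name `CALICO_BGP_ENABLE_BFD` and the accepted truthy values are hypothetical, made up for this example.

```python
import os

# Hypothetical toggle name, invented for this sketch; not an existing Calico variable.
ENV_TOGGLE = "CALICO_BGP_ENABLE_BFD"


def bfd_enabled(environ=os.environ) -> bool:
    """Treat a small set of truthy strings as 'on'; anything else is 'off'."""
    return environ.get(ENV_TOGGLE, "").strip().lower() in {"1", "true", "yes"}


def render_bfd_fragments(environ=os.environ):
    """Return (top_level_block, bgp_template_statement); both empty when the toggle is off."""
    if not bfd_enabled(environ):
        return "", ""
    return "protocol bfd {\n}", "bfd yes;"


if __name__ == "__main__":
    top_level, template_stmt = render_bfd_fragments({ENV_TOGGLE: "true"})
    print(top_level)
    print(template_stmt)
```

Guarding both fragments with the same check keeps the rendered config identical to today's output when the toggle is unset.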
Steps to Reproduce (for bugs)
Context
This issue gave us pause on adopting the kubernetes service IPs and external IPs functionality that calico is offering. We would like to use it to automate more our kubernetes service creation process which has some manual steps today.
Your Environment
3.17.1
kubernetes, version 1.16