juliantaylor opened this issue 3 years ago
The dangling blockaffinity had the following content:
spec:
  cidr: 100.70.221.192/26
  deleted: "false"
  node: kworker-be-prod-iz2-270
  state: pending
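(For reference, a sketch of how to cross-check this, assuming the Kubernetes datastore driver; the resource names are the backing crd.projectcalico.org ones and the grep patterns simply reuse the names from this issue:)

# Affinities claimed by the node, and whether a matching IPAM block exists
kubectl get blockaffinities.crd.projectcalico.org | grep kworker-be-prod-iz2-270
kubectl get ipamblocks.crd.projectcalico.org | grep 100-70-221-192-26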
@juliantaylor this looks to me like the apiserver connection failures happened in the middle of allocating a new block to that node. Allocation of a new block is a multi-step process, which looks roughly like this:
1. Create the block affinity for the node in the pending state.
2. Create the IPAM block itself.
3. Confirm the affinity (its state moves from pending to confirmed).
It looks like step 1 happened, but the API server issues likely hit immediately afterwards, preventing the subsequent steps. This should be OK, in that the next time someone tries to allocate that block the state will be cleaned up.
IIRC, we shouldn't be advertising routes for pending affinities, based on this filter logic here: https://github.com/projectcalico/confd/blob/master/etc/calico/confd/templates/bird_aggr.cfg.template#L35-L45
However, it does look like we will program a local blackhole for the traffic.
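(Both effects can be checked directly with standard iproute2 tooling; the CIDR below is the one from the affinity above:)

# On the affected node: is the block blackholed locally?
ip route show type blackhole
# On any other node: was a route for the block advertised and installed?
ip route | grep 100.70.221.192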
When this happened, were you seeing the route advertised to other nodes? Or just the local blackhole? And, was it causing routing issues? I wouldn't expect it to actually impact traffic.
IIRC, and from @juliantaylor's description, it was advertised via BGP into the routes of other nodes (e.g. 100.70.221.192/26 via xx.xx.199.27 dev tunl0 proto bird onlink), and it was a local blackhole route on the affected node.
It did not cause routing issues; we only noticed it in monitoring, because after prior issues we compare the number of routes configured on the nodes with the number of IPAM blocks.
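(A rough sketch of that kind of comparison, assuming /26 blocks and that bird is the only source of block routes on the node; the counts will only line up approximately:)

# IPAM blocks known to the datastore
kubectl get ipamblocks.crd.projectcalico.org --no-headers | wc -l
# Block routes programmed by bird on this node (remote blocks plus local blackholes)
ip route show proto bird | grep -c '/26'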
Sounds like we need to revisit and see why routes are being advertised for blockaffinities with state: pending. I don't think that is working as intended.
Was there a workaround for this? I have a BlockAffinity that is causing this behavior, advertising a route that gets blackholed. Attempts to delete it tell me that it's read-only, yet it also tells me the resource does not exist when I attempt to read this particular affinity entry.
@Cojacfar are you sure your block affinities aren't stuck in deleted: true, state: pendingDeletion?
The docs at https://docs.tigera.io/calico/latest/reference/resources/blockaffinity state:
deleted: When set to true, clients should treat this block as if it does not exist.
That might explain why you can list the affinity but not get it by name. At that point the question would be: why is the deletion still pending? Are there perhaps still pods running with IPs in this CIDR block?
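(One rough way to check, purely illustrative; it just greps the wide pod listing for the block's address prefix:)

# Any pods anywhere still holding an address from the stuck block? (substitute the block's prefix)
kubectl get pods -A -o wide --no-headers | grep '<block-prefix>'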
There aren't any pods running with IPs from that range on the node advertising the route, although that would have made a lot of sense. I actually removed this node completely from the cluster, uninstalling RKE2, and re-added it, thinking along similar lines about leftover addresses. However, the rule just got recreated by calico-node.
It is stuck pendingDeletion though, yes? And persists even after both deleting the node object from the cluster, and uninstalling the node completely?
the rule just got recreated by calico-node.
Did the blockAffinity disappear when the node was deleted, and come back after rejoining it? Or was it there the whole time? If the former, I wonder if there is perhaps a stale state file that is getting left behind on the node that persists across installations?
Hmm. I'm not sure if it disappeared when removed or not.
It is in pendingDeletion though, and it's actually noted in the BIRD file that it's awaiting deletion so it's blackholed.
protocol static {
  # IP blocks for this host.
  route 10.218.114.128/25 blackhole;
  route 10.218.180.192/26 blackhole;
}

# Aggregation of routes on this host; export the block, nothing beneath it.
function calico_aggr ()
{
  # Block 10.218.114.128/25 is confirmed
  if ( net = 10.218.114.128/25 ) then { accept; }
  if ( net ~ 10.218.114.128/25 ) then { reject; }
  # Block 10.218.180.192/26 is pendingDeletion
}
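(For anyone wanting to check the same thing: these fragments are rendered by confd inside the calico-node container; the paths below are the usual ones, and the namespace and pod name depend on your install.)

kubectl exec -n <calico-namespace> <calico-node-pod> -- cat /etc/calico/confd/config/bird.cfg /etc/calico/confd/config/bird_aggr.cfg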
@brandond Do you know of any way to force this deletion? I can't modify the resource directly, as it's reported either as not existing or as read-only.
I don't. I see that @caseydavenport picked this issue up - perhaps there are some suggestions from the Calico team?
There aren't any pods running with IPs from that range on the node advertising the route, although that would have made a lot of sense.
One thing to note is that it might not be a pod, but the tunnel address of that node that is claiming the block.
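(Not an official procedure, just one possible approach for anyone stuck here: the projectcalico.org/v3 view of BlockAffinity is read-only, which is why deletes are refused, but the backing crd.projectcalico.org object can still be inspected and, at your own risk, deleted directly. The object name below is illustrative, and calico-node may legitimately recreate the affinity if the node's tunnel address or a pod still sits inside the block.)

# Does the node's tunnel address fall inside the stuck block? (Calico records it on the Kubernetes Node object)
kubectl get node <node-name> -o yaml | grep -i tunneladdr
# Inspect and, if nothing references the block any more, delete the backing CRD object
kubectl get blockaffinities.crd.projectcalico.org | grep <node-name>
kubectl delete blockaffinities.crd.projectcalico.org <node-name>-10-218-180-192-26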
After a large deployment we had a short Kubernetes apiserver overload (causing connection failures), which was followed by an invalid route configured by Calico. The invalid route had no corresponding ipamblock object but did have a blockaffinity object, which prevented the route from being removed. No pods were in the IP range of the route/blockaffinity. After manually deleting the dangling blockaffinity, the route was removed automatically.
On the node with the dangling blockaffinity, the block was blackholed.
For comparison, 100-70-221-128-26 had both an ipamblock and an affinity.
The calico-node, calico-kube-controllers and calico-typha logs showed nothing interesting besides a few connection-cancelled messages on some watches during the short apiserver outage (e.g. in the controller log).
Restarting all calico components did not change anything.
We are not sure if this might have been caused by the apiserver outage or simply by some error in the ipamblock/affinity creation and subsequent deletion.
The cluster has 260 nodes, about 9000 running pods, 6 calico typha instances and 3 calico route reflectors.
Your Environment