/cc @maxrantil @lentzi90 @furkatgofurov7 @hardys
What steps did you take and what happened:

When deleting all resources for a capm3 cluster, a Metal3DataTemplate resource remains pending deletion: its `metal3datatemplate.infrastructure.cluster.x-k8s.io` finalizer never gets cleared.

What did you expect to happen:

No resource should have remained.
TL;DR, me speculating on the cause
Disclaimer: my familiarity with capm3 is low, please take what follows as humble hypothesis ;)
The Metal3DataTemplate controller reconcileDelete does not seem able to take into account that a Metal3DataClaim no longer exists. More specifically, DataTemplateManager.UpdateDatas blindly assumes that all Metal3DataClaims reported to exist by the Metal3Data resources actually exist.
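To illustrate the mechanism I have in mind, here is a minimal sketch of the deletion path as I understand it. This is my paraphrase only: the `DataManager` interface, the `UpdateDatas` signature returning an allocations count, and `UnsetFinalizer` are stand-ins, not the actual capm3 code.

```go
package sketch

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
)

// DataManager is a stand-in for the Metal3DataTemplate manager; the real
// capm3 interface and return types differ, this only mirrors the logic I
// believe is involved.
type DataManager interface {
	// UpdateDatas reconciles the Metal3Data / Metal3DataClaim allocations
	// and reports how many allocations it still considers live.
	UpdateDatas(ctx context.Context) (int, error)
	UnsetFinalizer()
}

func reconcileDelete(ctx context.Context, m DataManager) (ctrl.Result, error) {
	allocationsCount, err := m.UpdateDatas(ctx)
	if err != nil {
		return ctrl.Result{}, err
	}
	if allocationsCount > 0 {
		// If a Metal3DataClaim was deleted but its allocation is still
		// listed, this count never drops to zero, so the finalizer is
		// never removed and the Metal3DataTemplate stays pending deletion.
		return ctrl.Result{Requeue: true}, nil
	}
	m.UnsetFinalizer()
	return ctrl.Result{}, nil
}
```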
Context
In the context of the Sylva project CI pipelines, since we upgraded to capm3 1.8.1, we've been regularly observing a case where a Metal3DataTemplate remains stuck pending deletion because its `metal3datatemplate.infrastructure.cluster.x-k8s.io` finalizer is never cleared (for reference: our related gitlab issue).

The context is a workload cluster deletion scenario: we first delete the resources of the cluster, the CAPI Cluster object first, then once the Cluster resource is deleted, we delete all the CAPI resources that our system defines (KubeadmControlPlane, MDs, all the XXX template resources, etc.), and then the namespace.
(I realized when digging into this issue that we were deleting the Metal3DataTemplate too early, possibly before the Metal3Machines using it would be deleted, so I suspect this is possibly contributing to triggering this issue. We're aware that we possibly need to wait and delete resources in a more orderly way...)
I dug into one specific occurrence, and I'm sharing my observations here.
Observations
The observations shared are based on the dumps of resources and logs that we take after a CI job failure. All the data can be found here: https://gitlab.com/sylva-projects/sylva-core/-/jobs/7852801513. The interesting bits are under `management-cluster-dump`, which has a `clusterctl-describe.txt`, `Nodes.summary.txt`, and `Metal3*` files with resource dumps (the relevant ones are in `namespace: kubeadm-capm3-virt`, and the `capm3-system` directory has capm3 controller logs). I'm directly attaching the Metal3DataTemplate resource dump and the capm3 controller logs to this issue.

At the point where the dump is taken, we see the following:
The `status.indexes` below is interesting; it's related to why the finalizer can't be removed. It designates a Metal3DataClaim resource (wc-1458191268-kubeadm-capm3-virt-control-plane-fn667) that is deleted very early.

`deletionTimestamp: "2024-09-18T09:53:32Z"`
Grepping the logs of capm3 controllers (no log after 09:51:51 for our Metal3DataTemplate):

The above shows, among other things, that the Metal3DataClaims were deleted quite early (09:47:51).

• on related resources (same prefix)

This shows that:
Tentative analysis
I've been looking at the code to understand why the Metal3DataTemplate reconciliation could have been unable to remove the finalizer, and unable to remove the non-existent Metal3DataClaim from status.indexes.

I'm trying to infer what the code would have seen between 09:47:51 (when the deletionTimestamp was set on the Metal3DataTemplate, and when the Metal3DataClaims were deleted) and the deletion of the Metal3Data (which happened after 09:53:40).

So, in this window:
The code in DataTemplateManager.UpdateDatas seems to:

• see the indexes as `map[0: wc-1458191268-kubeadm-capm3-virt-control-plane-fn667]`
• call `indexes, err = m.updateData(ctx, &dataClaim, indexes)` to update the indexes
This is consistent with the observation that the allocationsCount seen by reconcileDelete is 1, that the finalizer is hence not removed, and with the status.indexes we see on the Metal3DataTemplate resource.
It seems that UpdateDatas should check that the Metal3DataClaim names returned by getIndexes actually exist, and remove from `indexes` the ones that do not.
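Something along these lines is what I have in mind. This is purely a sketch: it assumes the indexes map is keyed by index with the claim name as value (as the `map[0: ...]` above suggests), and the helper name `pruneStaleClaims` and its wiring are mine, not existing capm3 code.

```go
package sketch

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"

	infrav1 "github.com/metal3-io/cluster-api-provider-metal3/api/v1beta1"
)

// pruneStaleClaims drops from `indexes` the entries whose Metal3DataClaim no
// longer exists, so that a Metal3DataTemplate pending deletion can eventually
// see an allocations count of zero and have its finalizer removed.
func pruneStaleClaims(ctx context.Context, c client.Client, namespace string, indexes map[int]string) (map[int]string, error) {
	for index, claimName := range indexes {
		claim := &infrav1.Metal3DataClaim{}
		err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: claimName}, claim)
		if apierrors.IsNotFound(err) {
			// The claim reported by getIndexes is gone: forget its
			// allocation instead of assuming it still exists.
			delete(indexes, index)
			continue
		}
		if err != nil {
			return indexes, err
		}
	}
	return indexes, nil
}
```

An alternative might be to do this kind of cleanup inside getIndexes itself; I'll leave that choice to people who know the code better.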
I would speculate that #1478 would have made this latent issue occur more frequently. Indeed #1478 "Before cleaning up Metal3Data, check that the Metal3DataClaim is gone", merged in the 1.8.x development timeframe, would possibly create the problematic condition: Metal3DataClaim gone, but Metal3Data still points to it.
Worth noting
The logs show various issues occurring at different points. I haven't dug into those specifically, with the idea that despite transient errors the processing should still be able to converge, and also because I suspect that the way Sylva tears down CAPI and CAPI provider resources may not be orderly enough and might cause these issues.
Environment:

• Kubernetes version (use `kubectl version`): 1.28.12

/kind bug