Closed mdbooth closed 3 months ago
This issue is currently awaiting triage.
If CAPI contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance. The `triage/accepted` label can be added by org members by writing `/triage accepted` in a comment.
It breaks server-side apply. Applying a v1alpha6 cluster and then applying the same v1alpha6 cluster again without changes results in:
```console
$ kubectl apply -f capo-cluster.yaml --server-side=true
secret/mbooth-v1alpha6-cloud-config serverside-applied
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/mbooth-v1alpha6-md-0 serverside-applied
openstackcluster.infrastructure.cluster.x-k8s.io/mbooth-v1alpha6 serverside-applied
openstackmachinetemplate.infrastructure.cluster.x-k8s.io/mbooth-v1alpha6-control-plane serverside-applied
openstackmachinetemplate.infrastructure.cluster.x-k8s.io/mbooth-v1alpha6-md-0 serverside-applied
Apply failed with 1 conflict: conflict with "manager" using cluster.x-k8s.io/v1beta1: .spec.infrastructureRef
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts
Apply failed with 1 conflict: conflict with "manager" using cluster.x-k8s.io/v1beta1: .spec.template.spec.infrastructureRef
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts
Apply failed with 1 conflict: conflict with "manager" using controlplane.cluster.x-k8s.io/v1beta1: .spec.machineTemplate.infrastructureRef
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts
```
I think this is a serious bug.
Paired with Fabrizio, just providing some information/context.

With `kubectl apply`, as a user you can set `--force-conflicts` to overwrite the value, which is probably fine if you only apply your YAML manually once in a while.

To summarize:
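For example, a one-off manual re-apply that takes ownership of the conflicting fields would look like this (using the manifest name from the report):

```sh
kubectl apply -f capo-cluster.yaml --server-side --force-conflicts
```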
I disagree with this characterisation: this breaks clients using SSA without workarounds. Arguing that SSA itself is working as intended is pedantic: the behaviour of the controller breaks the expected semantics of a well-behaved object when updated using SSA. We should agree that the behaviour breaks SSA.
I accept that it might be impossible to fix in v1beta1. We can pass on existing workarounds to CAPO users when their tooling breaks because of this. Fixing in v1beta2 may be the pragmatic solution.
I think @sbueringer said that the field is not used. If this is the case, could we simply stop updating it in v1beta1, and remove it in v1beta2? That would fix the problem in both versions.
Sorry if we gave this impression, but we did not intend to be pedantic. As we usually do, we try to carve out some of our time to help because we care about this project and this community.
> We should agree that the behavior breaks SSA
I still think that it is really important to clarify that reporting conflicts is the expected behavior of the SSA implementation in the API server, that there are no bugs or errors in the SSA implementation in the API server, and that SSA itself is not broken.
Why press on this point? Not to be pedantic, but because if we keep saying that SSA is broken, someone could assume that the fix must be implemented in the API server, or in kubectl or client-go, or that this issue might impact other SSA users not using Cluster API, ultimately leading to more confusion.
Instead, it is CAPI we have to look at, and more specifically the long-standing CAPI behavior that keeps the version part of object references up to date.
And I think we all agree on this, because you too are talking about fixing things in CAPI.
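To illustrate the mechanics (a minimal sketch, not the real API-server implementation): SSA tracks a manager per field, and a conflict is reported only when an applier's value for a field differs from one set by another manager. The CAPI controller rewrites `infrastructureRef.apiVersion` under its own manager (`"manager"`), so the user's next apply of the old apiVersion conflicts:

```python
# Minimal model of SSA field ownership (illustrative only; the real logic
# lives in the API server's structured-merge-diff implementation).

def apply(state, owners, manager, desired, force=False):
    """Apply desired fields as `manager`; report conflicts unless forced."""
    conflicts = [
        field for field, value in desired.items()
        if owners.get(field, manager) != manager and state.get(field) != value
    ]
    if conflicts and not force:
        raise RuntimeError(f"conflict with {owners[conflicts[0]]!r}: {conflicts}")
    for field, value in desired.items():
        state[field] = value
        owners[field] = manager  # simplification: sole ownership, no co-owners

state, owners = {}, {}
# 1. The user applies a v1alpha6 reference.
apply(state, owners, "kubectl",
      {"spec.infrastructureRef.apiVersion": ".../v1alpha6"})
# 2. The CAPI controller rewrites it to the newest version, taking ownership
#    (its update is not a conflict-checked apply, hence force=True here).
apply(state, owners, "manager",
      {"spec.infrastructureRef.apiVersion": ".../v1alpha7"}, force=True)
# 3. Re-applying the unchanged user manifest now conflicts, as kubectl reports.
try:
    apply(state, owners, "kubectl",
          {"spec.infrastructureRef.apiVersion": ".../v1alpha6"})
except RuntimeError as err:
    print(err)
# 4. --force-conflicts corresponds to force=True: the value is overwritten
#    and ownership returns to kubectl.
apply(state, owners, "kubectl",
      {"spec.infrastructureRef.apiVersion": ".../v1alpha6"}, force=True)
```

This also shows why applying the same value is fine (co-ownership, no conflict) while applying a different value is not: the controller changed the value behind the user's back.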
NOTE: I changed the title of the issue to try to move past this back-and-forth.
> I think @sbueringer said that the field is not used. If this is the case, could we simply stop updating it in v1beta1, and remove it in v1beta2? That would fix the problem in both versions.

Unfortunately, this is not possible, because there are users relying on references on Cluster API objects always being up to date (clusterctl describe also relies on this).
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/priority important-longterm
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/close
We have https://github.com/kubernetes-sigs/cluster-api/issues/6539 to track redesign of references in CAPI (thus dropping API version, which caused this issue)
@fabriziopandini: Closing this issue.
What steps did you take and what happened?
I'm currently investigating a strange quirk of API conversion in CAPO. Our next release will introduce v1alpha7, and I am testing the behaviour of v1alpha6. I noticed that although I defined a MachineDeployment with:
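(The YAML snippet is missing from this copy of the issue; reconstructed from the resource names in the apply output above, it would have been a reference along these lines, not verbatim:)

```yaml
# Reconstructed example; the exact surrounding fields in the original
# report may differ.
spec:
  template:
    spec:
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
        kind: OpenStackMachineTemplate
        name: mbooth-v1alpha6-md-0
```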
Something has rewritten this as:
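(The rewritten snippet is also missing from this copy; reconstructed, the same reference would now carry the bumped apiVersion:)

```yaml
# Reconstructed example: the same reference with only apiVersion changed.
spec:
  template:
    spec:
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
        kind: OpenStackMachineTemplate
        name: mbooth-v1alpha6-md-0
```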
This works, but the results are subtly different from using the explicitly specified version.

This breaks server-side apply, as well as creating machine objects which are subtly different. Specifically, this will create a v1alpha7 OpenStackMachine from the converted v1alpha7 OpenStackMachineTemplate. This means that the object will not have a conversion annotation, and therefore lossy v1alpha6->v1alpha7->v1alpha6 transformations will not be restored.

We have one of these in OpenStackMachine. In v1alpha6 the machine spec contains both `networks` and `ports`. In v1alpha7 `networks` are removed and they all become `ports`. This means that on down-conversion, without the conversion annotation, `networks` cannot be correctly restored because we don't know which `ports` were previously `networks`.

This means that when viewed as v1alpha6, machines created before and after upgrade by the same MachineDeployment are different.
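For context, CAPI's conversion machinery preserves lossy data by serialising the original object into an annotation on the hub-version object, which down-conversion reads back. A sketch of what that looks like (field contents are illustrative):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachine
metadata:
  annotations:
    # Serialised copy of the pre-conversion object, written by
    # utilconversion.MarshalData and consumed on down-conversion.
    cluster.x-k8s.io/conversion-data: '{"spec":{"networks":[...],"ports":[...]}}'
```

A machine created directly from the already-converted v1alpha7 template never gets this annotation, which is why its v1alpha6 view differs.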
Hopefully this doesn't matter too much. Assuming we have no bugs in our conversion logic, the converted machine should operate the same as the old version, and external operators such as Argo CD are more likely to be reconciling the template rather than the resulting machine. However, it does open the door to edge cases, and creates an unexpected difference in behaviour across an upgrade. I also wonder how reconciliation tools (like Argo CD) handle the fact that the MachineDeployment spec changed without being asked to.

The serious issue is that the server-side changes to the Cluster, KubeadmControlPlane, and MachineDeployment specs break server-side apply. This will break any client tooling which relies on server-side apply. Specifically, it breaks `kubectl apply ... --server-side=true`, and is likely to break third-party tooling such as Argo CD.

Incidentally, @lentzi90 found the code which changes it: https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_controller.go#L413 and https://github.com/kubernetes-sigs/cluster-api/blob/main/util/conversion/conversion.go#L62
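The effect of the linked code is to rewrite the version part of the reference's apiVersion to the latest version served for the contract, keeping the group unchanged. A simplified sketch of that behaviour (the function name and signature are illustrative, not CAPI's actual API, and the real code discovers versions from the provider's CRD):

```python
# Simplified, illustrative sketch of CAPI's "keep references up to date"
# behaviour; the real implementation is in util/conversion.

def update_reference_api_contract(api_version: str, latest_version: str) -> str:
    """Return the reference's apiVersion with its version bumped."""
    group, _, _old_version = api_version.partition("/")
    return f"{group}/{latest_version}"

print(update_reference_api_contract(
    "infrastructure.cluster.x-k8s.io/v1alpha6", "v1alpha7"))
# infrastructure.cluster.x-k8s.io/v1alpha7
```

Because this rewrite happens inside the controller's own update, the controller's field manager ends up owning `infrastructureRef`, producing the SSA conflicts shown above.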
What did you expect to happen?
The infrastructureRef in the MachineDeployment should not be updated automatically. The user can reasonably do this themselves when they update their actual templates to the new version. Having it updated automatically is surprising.
Cluster API version
git HEAD, commit 022ccf1dd0c240652b115fdb9769cda70ad1be49 (approximately version 1.5.x)
Kubernetes version
1.28.1
Anything else you would like to add?
No response
Label(s) to be applied
/kind bug /area upgrades /area api /area machinedeployment