Closed: Dbzman closed this issue 9 months ago.
Which version of k8s are you using?
We use 1.22.14.
Could be the case that this is something that needs to be handled when doing the diff in gitops-engine, but I'm not familiar enough with SSA to say for sure. @leoluz?
(potentially related to #11139?)
@Dbzman Please inspect your Argo CD controller logs and see if you find an entry with this message:
error creating gvk parser: ...
If so, can you provide the full message in the log?
@leoluz We didn't see any of those errors. We configured the loglevel to info. Not sure if the error is supposed to show there.
What we further observed is that this doesn't happen consistently for all apps, but they all use the same API version (batch/v1).
We noticed a very strange behavior here. We saved the affected CronJob manifest locally, deleted it on Kubernetes and re-created it again. (so it's the exact same manifest, just re-created) After that, Argo was able to sync the application.
One thing is that those CronJobs were created with an older API version in the past, but we upgraded them to batch/v1 long ago, and Kubernetes also shows them as batch/v1. Don't know why re-creation helps in that case.
> We noticed a very strange behavior here. We saved the affected CronJob manifest locally, deleted it on Kubernetes and re-created it again. (so it's the exact same manifest, just re-created) After that, Argo was able to sync the application.
Thanks for the additional info. That actually makes sense. What is strange to me is that, from your error message, it seems that Argo CD is trying to convert from v1.CronJob to v1beta1.CronJob. Not sure why it is trying to go with an older version. That would only make sense if you were applying a CronJob with v1beta1.
I'll try to reproduce this error locally anyways.
Thanks for checking. Indeed, it's really weird that it tries to convert to an older version.
We had this issue on 60 of our 400 apps. Yesterday we fixed them all with the above mentioned workaround. Today all of those 60 apps show the error again. So it seems that it has nothing to do with old manifests that were upgraded.
@Dbzman just confirming: are the steps to reproduce still valid with your latest findings?
@leoluz I would say yes.
Using 2.5.1 version and having similar issues.
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v1beta1.PodDisruptionBudget) to (v1.PodDisruptionBudget): unknown conversion
and
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2beta2.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion
Same here with 2.5.2:
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2beta1.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion
Same behavior with 2.5.2:
ComparisonError: error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v1.Ingress) to (v1beta1.Ingress): unknown conversion
Adding Ingress in case someone hits the issue with that resource.
Just to provide some direction for users that might get into this error: the current workaround is disabling SSA for the failing resources by adding the annotation argocd.argoproj.io/sync-options: ServerSideApply=false. For example, if the error is related to Ingress conversion, then add the annotation to your Ingress resource.
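As an illustrative sketch of that workaround (the resource name, host, and service are hypothetical), the annotation on an affected Ingress would look like:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress        # hypothetical name
  annotations:
    # Disable Server-Side Apply for this resource only;
    # other resources in the app keep using SSA
    argocd.argoproj.io/sync-options: ServerSideApply=false
spec:
  rules:
    - host: example.com        # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-svc   # hypothetical service
                port:
                  number: 80
```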
Hi @leoluz, I added the annotation, but it didn't work; still the same problem (HorizontalPodAutoscaler case).
fwiw the same occurs with CronJob resources on 2.5.5 as well
We run into similar issues when enabling SSA for our apps. However, the issue isn't consistent between clusters/apps (the same app/resource might work on one but not the other).
> What is strange to me is that from your error message it seems that Argo CD is trying to convert from v1.CronJob to v1beta1.CronJob. Not sure why it is trying to go with an older version. That would only make sense if you are applying a CronJob with v1beta1.
@leoluz I believe managedFields are to blame. They include an apiVersion field that might reference an older (beta) version.
It also explains why recreating works - it clears the managedFields.
Sadly, it does not help me yet to resolve this issue without recreating the resources (I haven't found a way to clear/edit the managedFields).
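For context, the stale manager entry is visible directly on the live object. A trimmed, illustrative example (the manager name and versions vary per cluster) might look like:

```yaml
metadata:
  managedFields:
    - manager: kubectl-client-side-apply
      operation: Update
      # apiVersion recorded when the object was last applied as batch/v1beta1;
      # it is not rewritten when the manifest later moves to batch/v1
      apiVersion: batch/v1beta1
      fieldsType: FieldsV1
```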
> I believe managedFields are to blame. They include an apiVersion field that might reference an older (beta) version.
This is not a "might"; this is the definitive issue. 😓
@leoluz perhaps this will help: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/555-server-side-apply/README.md
Links that are useful in the readme are: https://github.com/kubernetes-sigs/structured-merge-diff/blob/master/merge/obsolete_versions_test.go
Do we know why Argo CD does not respect ".Capabilities.APIVersions" but uses the "managedFields" (assuming that is the reason; I don't know which component does this internally) to decide which API group/version to use?
We are seeing this in 2.8 with HPA, ClusterRole, ClusterRoleBinding, and Role resources, on clusters that have all been properly upgraded and whose resource manifests have been updated, but which were created back when these beta API versions still existed in k8s (they have since been removed).
We're seeing the same issue with ClusterRole, ClusterRoleBinding.
The K8s docs note that you can clear managedFields with a JSON patch. We've been employing that to get past this issue, but it's really tiresome. It would be great if Argo could somehow handle it. The errors in the Argo CD sync panel aren't helpful enough, because they don't tell us which resource had the conversion error.
k patch KIND NAME --type json -p '[{"op":"replace","path":"/metadata/managedFields","value":[{}]}]'
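To check whether a resource carries a stale manager entry before patching, something like the following should work (the resource kind and name are placeholders; these commands need a live cluster):

```shell
# List each field manager and the apiVersion it recorded
kubectl get cronjob my-cronjob \
  -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\t"}{.apiVersion}{"\n"}{end}'

# If an old version (e.g. batch/v1beta1) shows up, clear the managedFields
# with the JSON patch from the comment above
kubectl patch cronjob my-cronjob --type json \
  -p '[{"op":"replace","path":"/metadata/managedFields","value":[{}]}]'
```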
@msw-kialo fyi ^
The ServerSide Diff feature is merged and available in Argo CD 2.10-RC1. If enabled, it should address this and other diff problems when ServerSide Apply is used.
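If I read the 2.10 docs correctly, Server-Side Diff can be opted into per Application with an annotation; a minimal sketch (the app name is hypothetical):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                 # hypothetical name
  annotations:
    # Opt this Application into Server-Side Diff
    argocd.argoproj.io/compare-options: ServerSideDiff=true
```

The docs also describe a cluster-wide switch via the controller.diff.server.side key in the argocd-cmd-params-cm ConfigMap.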
I am closing this for now and feel free to reopen if the issue persists.
Ran into a similar issue failing to calculate diff for ClusterRole: "converting (v1.ClusterRole) to (v1beta1.ClusterRole):"
Enabling server side diff on the application resolved the issue for me.
We are using 2.10.5 and have this problem when we try to enable server-side apply. I deleted all the HPAs, but it didn't help.
ComparisonError: Failed to compare desired state to live state: failed to calculate diff: error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion
Deleting the old HPAs and an old secret solved the issue in my case.
Checklist:

- [x] I've pasted the output of argocd version.

Describe the bug
Using ServerSideApply, configured in an Application via Sync Options, fails with the unknown conversion error quoted in the comments above. Using it only with the "Sync" button, without having it configured for the app, works, though.

To Reproduce

- Have a CronJob with apiVersion batch/v1 or an HPA with apiVersion autoscaling/v2beta2 that was previously synced without SSA

Expected behavior
ServerSideApply should work in both cases (app config + manual sync)
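The failing configuration corresponds to an Application with ServerSideApply enabled in its sync options; a minimal sketch (all names and the repo URL are hypothetical):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                            # hypothetical
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/repo.git # hypothetical
    path: manifests
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    syncOptions:
      - ServerSideApply=true   # the option that triggers the diff error
```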
Screenshots Application configuration which breaks:
Using it only with the Sync button works:
Version