Open rumstead opened 10 months ago
It deleted every single resource? I thought there was a failsafe which would block syncs if every resource was marked to be pruned.
It deleted every resource deployed by that application :-/ and then the finalizers took over downstream.
I feel like the only real hint from the logs is all the `obj->nil` messages.
Agreed, from my understanding of the code:
Local manifests support the `argocd app sync --local` feature, i.e. temporary overrides from a local manifest source. Probably not relevant in this case.
Feels like the problem must be similar to https://github.com/argoproj/argo-cd/issues/2573 - some error being ignored silently.
The resources deployed are a CRD that we created. I ran them through the code a few times without any issues unmarshaling and after the hard refresh it worked. It didn't like something :-/.
Do you happen to know if there is any code that safeguards against pruning everything?
The CRDs are cluster scoped
❯ k api-resources --api-group=internal.com
NAME SHORTNAMES APIVERSION NAMESPACED KIND
tenants tn,tnt internal.com/v1alpha1 false Tenant
One thing we also noticed: in the prune case, the sync tasks look like this (i.e., the namespace is missing from the resource key):
Sync/0 resource internal.com/Tenant:/app-ui obj->nil
After the hard refresh, they look like this
Sync/0 resource internal.com/Tenant:namespace/app-ui obj->obj
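For anyone combing through controller logs for the same symptom, here is a minimal sketch that parses task strings of the shape shown above and flags a run in which every task is a prune. It reads `obj->nil` as "live object present, target absent", which matches how the tasks are discussed in this thread. The regular expression and the helper names `parse_tasks` and `all_prunes` are illustrative assumptions, not part of Argo CD.

```python
import re

# One task entry as it appears in the logs quoted in this thread, e.g.
#   Sync/0 resource internal.com/Tenant:/app-ui obj->nil
# The field layout (phase/wave, group/kind:namespace/name, live->target)
# is inferred from those log lines, not taken from Argo CD source.
TASK_RE = re.compile(
    r"(?P<phase>\w+)/(?P<wave>-?\d+) resource "
    r"(?P<group>[^/\s]*)/(?P<kind>[^:\s]+):(?P<namespace>[^/\s]*)/(?P<name>\S+) "
    r"(?P<live>obj|nil)->(?P<target>obj|nil)"
)


def parse_tasks(message):
    """Return one dict per task found in a log message."""
    return [m.groupdict() for m in TASK_RE.finditer(message)]


def all_prunes(tasks):
    """True when there is at least one task and every task is obj->nil."""
    return bool(tasks) and all(
        t["live"] == "obj" and t["target"] == "nil" for t in tasks
    )


if __name__ == "__main__":
    pruned = parse_tasks("Sync/0 resource internal.com/Tenant:/app-ui obj->nil")
    healthy = parse_tasks(
        "Sync/0 resource internal.com/Tenant:namespace/app-ui obj->obj"
    )
    print(pruned[0]["namespace"] == "", all_prunes(pruned))
    print(healthy[0]["namespace"], all_prunes(healthy))
```

Running `parse_tasks` over a full `Tasks (dry-run)` message works the same way, since each entry in the list is matched independently.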
Hm. Did you make changes to the CRD recently to move from namespaced to non-namespaced?
Figuring that the prune happened against revision `9b9ff97b11d83373bdd4d05fadcf20c791e5600e`, have you had a look at repository server logs? Have there been some errors recorded for rendering this particular revision?
No recent changes in the CRD, it has always been cluster scoped, but the Argo CD Application that deploys the `Tenant` resources has a namespace defined in it (though it always has). No errors in the repo server logs against that commit :(.
The only logs I have tracked down in the repo server are the `GenerateManifest` gRPC call from `getRepoObjs` and the subsequent `git checkout --force`.
time="2023-08-12T17:25:47Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://...,Path:apps/,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:&ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,}/9b9ff97b11d83373bdd4d05fadcf20c791e5600e"
and then further cache hits
time="2023-08-12T17:25:54Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:https://...,Path:apps/,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:&ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,}/9b9ff97b11d83373bdd4d05fadcf20c791e5600e"
EDIT: since `targetObj` was nil, it uses `liveObj`. Since `liveObj` doesn't have a namespace (it's cluster scoped), no namespace is used.
EDIT2: Prune does indeed work... (duh!)
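The observation in the first EDIT can be illustrated with a small sketch: if the task's identity is taken from the target object when one exists and from the live object otherwise, a cluster-scoped live object yields an empty namespace in the printed key. This is an illustration of that reasoning only; the `Obj` type and `task_key` function are hypothetical and are not gitops-engine code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    """Hypothetical stand-in for a Kubernetes object's identifying fields."""
    group: str
    kind: str
    namespace: str
    name: str


def task_key(live: Optional[Obj], target: Optional[Obj]) -> str:
    """Build a key like the ones in the sync task log lines.

    Assumption: the target object is preferred and the live object is the
    fallback, as described in the EDIT above.
    """
    obj = target if target is not None else live
    if obj is None:
        raise ValueError("a task needs a live or a target object")
    return f"{obj.group}/{obj.kind}:{obj.namespace}/{obj.name}"


if __name__ == "__main__":
    # Live cluster-scoped object: no namespace recorded on it.
    live = Obj("internal.com", "Tenant", "", "app-ui")
    # Target rendered from Git, carrying a namespace as in the post-refresh log.
    target = Obj("internal.com", "Tenant", "namespace", "app-ui")

    print(task_key(live, None))    # prune case, target missing
    print(task_key(live, target))  # normal case, target present
```

Under these assumptions the missing namespace is a side effect of the missing target rather than a separate problem.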
We have tried relentlessly to reproduce this and haven't been able to. I am going to close it and will reopen if we get more details. Thanks for everyone's suggestions. Hopefully, we don't see this again.
Hi,
Sorry to reopen this issue, but we have experienced the same behaviour. In the logs we see that the application-controller finishes a reconciliation in both pods (we have HA) without changes:
Log | PodName |
---|---|
time="2024-04-23T11:13:09Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:09Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:09Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:09Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:09Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=11 unmarshal_ms=10 version_ms=0 | argocd-application-controller-1 |
time="2024-04-23T11:13:09Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=8 plugins_ms=0 repo_ms=0 time_ms=16 unmarshal_ms=8 version_ms=0 | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:09Z" level=info msg="No status changes. Skipping patch" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Reconciliation completed" application=argocd/sdk-builder dedup_ms=0 dest-name= dest-namespace=sdk-builder dest-server="https://kubernetes.default.svc" diff_ms=35 fields.level=1 git_ms=16 health_ms=2 live_ms=9 settings_ms=0 sync_ms=0 time_ms=92 | argocd-application-controller-0 |
time="2024-04-23T11:13:09Z" level=info msg="Reconciliation completed" application=argocd/sdk-builder dedup_ms=0 dest-name= dest-namespace=sdk-builder dest-server="https://kubernetes.default.svc" diff_ms=43 fields.level=1 git_ms=11 health_ms=2 live_ms=5 settings_ms=0 sync_ms=0 time_ms=91 | argocd-application-controller-1 |
time="2024-04-23T11:13:09Z" level=info msg="No status changes. Skipping patch" application=argocd/sdk-builder | argocd-application-controller-1 |
Roughly 15 seconds later, at 11:13:24, both application controllers start the resync process again against the same git commit (1e189c27189403969bd31706760e7a43c2fc833a). This time the repo server reports a cache miss, and a new sync starts that begins pruning resources:
Log | PodName |
---|---|
time="2024-04-23T11:13:14Z" level=info msg="Start processing" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:14Z" level=info msg="Processing completed" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:24Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:24Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:24Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:24Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:24Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:24Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:26Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:26Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:28Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=3635 unmarshal_ms=3635 version_ms=0 | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Start processing" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="Processing completed" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="Initiated automated sync to '1e189c27189403969bd31706760e7a43c2fc833a'" application=sdk-builder dest-namespace=sdk-builder dest-server="https://kubernetes.default.svc" reason=OperationStarted type=Normal | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Initiated automated sync to '1e189c27189403969bd31706760e7a43c2fc833a'" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Updated sync status: Synced -> OutOfSync" application=sdk-builder dest-namespace=sdk-builder dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Processing completed" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="Start processing" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="Initialized new operation: {&SyncOperation{Revision:1e189c27189403969bd31706760e7a43c2fc833a,Prune:true,DryRun:false,SyncStrategy:nil,Resources:[]SyncOperationResource{SyncOperationResource{Group:,Kind:ServiceAccount,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder-mrfio-https,Namespace:,},SyncOperationResource{Group:apps,Kind:Deployment,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:rbac.authorization.k8s.io,Kind:RoleBinding,Name:sdk-builder-rolebinding,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-cluster,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:VirtualService,Name:sdk-builder-leaderelection,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:VirtualService,Name:sdk-builder-vs,Namespace:,},SyncOperationResource{Group:,Kind:Secret,Name:sdk-builder-env,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdkmrfio-gw,Namespace:,},SyncOperationResource{Group:policy,Kind:PodDisruptionBudget,Name:sdk-builder-pdb,Namespace:,},SyncOperationResource{Group:,Kind:Namespace,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder-cluster-https,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-mrfio,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:ServiceEntry,Name:sdk-mrf-io-cloudfront,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-leaderelection,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:DestinationRule,Name:sdk-mrfio-dest,Namespace:,},SyncOperationResource{Group:monitoring.coreos.com,Kind:ServiceMonitor,Name:appmetrics-sdk-builder,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:DestinationRule,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:rbac.authoriz
ation.k8s.io,Kind:Role,Name:sdk-builder-role,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder.leaderelection.mrf.io,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdkmrfio-https,Namespace:,},SyncOperationResource{Group:,Kind:Service,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:autoscaling,Kind:HorizontalPodAutoscaler,Name:sdk-builder-autoscaler,Namespace:,},SyncOperationResource{Group:security.istio.io,Kind:AuthorizationPolicy,Name:sdk-builder,Namespace:,},},Source:nil,Manifests:[],SyncOptions:[ServerSideApply=true],Sources:[]ApplicationSource{},Revisions:[],} { true} [] {5 nil}}" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="updated 'argocd/sdk-builder' operation (phase: Running)" | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-6q492 |
time="2024-04-23T11:13:28Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0 | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Initialized new operation: {&SyncOperation{Revision:1e189c27189403969bd31706760e7a43c2fc833a,Prune:true,DryRun:false,SyncStrategy:nil,Resources:[]SyncOperationResource{SyncOperationResource{Group:,Kind:ServiceAccount,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder-mrfio-https,Namespace:,},SyncOperationResource{Group:apps,Kind:Deployment,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:rbac.authorization.k8s.io,Kind:RoleBinding,Name:sdk-builder-rolebinding,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-cluster,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:VirtualService,Name:sdk-builder-leaderelection,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:VirtualService,Name:sdk-builder-vs,Namespace:,},SyncOperationResource{Group:,Kind:Secret,Name:sdk-builder-env,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdkmrfio-gw,Namespace:,},SyncOperationResource{Group:policy,Kind:PodDisruptionBudget,Name:sdk-builder-pdb,Namespace:,},SyncOperationResource{Group:,Kind:Namespace,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder-cluster-https,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-mrfio,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:ServiceEntry,Name:sdk-mrf-io-cloudfront,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:Gateway,Name:sdk-builder-leaderelection,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:DestinationRule,Name:sdk-mrfio-dest,Namespace:,},SyncOperationResource{Group:monitoring.coreos.com,Kind:ServiceMonitor,Name:appmetrics-sdk-builder,Namespace:,},SyncOperationResource{Group:networking.istio.io,Kind:DestinationRule,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:rbac.authoriz
ation.k8s.io,Kind:Role,Name:sdk-builder-role,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdk-builder.leaderelection.mrf.io,Namespace:,},SyncOperationResource{Group:cert-manager.io,Kind:Certificate,Name:sdkmrfio-https,Namespace:,},SyncOperationResource{Group:,Kind:Service,Name:sdk-builder,Namespace:,},SyncOperationResource{Group:autoscaling,Kind:HorizontalPodAutoscaler,Name:sdk-builder-autoscaler,Namespace:,},SyncOperationResource{Group:security.istio.io,Kind:AuthorizationPolicy,Name:sdk-builder,Namespace:,},},Source:nil,Manifests:[],SyncOptions:[ServerSideApply=true],Sources:[]ApplicationSource{},Revisions:[],} { true} [] {5 nil}}" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="updated 'argocd/sdk-builder' operation (phase: Running)" | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg=Syncing application=argocd/sdk-builder skipHooks=true started=false syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-6q492 |
time="2024-04-23T11:13:28Z" level=info msg="Processing completed" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="Start processing" resource=argocd/sdk-builder | argocd-notifications-controller-65cffbdbdd-qh8fk |
time="2024-04-23T11:13:28Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0 | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Reconciliation completed" application=argocd/sdk-builder dedup_ms=0 dest-name= dest-namespace=sdk-builder dest-server="https://kubernetes.default.svc" diff_ms=24 fields.level=1 git_ms=3635 health_ms=2 live_ms=6 settings_ms=0 sync_ms=0 time_ms=3848 | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Update successful" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Tasks (dry-run)" application=argocd/sdk-builder syncId=05403-RNPms tasks="[Sync/0 resource /Namespace:/sdk-builder obj->nil (,,), Sync/0 resource policy/PodDisruptionBudget:sdk-builder/sdk-builder-pdb obj->nil (,,), Sync/0 resource /ServiceAccount:sdk-builder/sdk-builder obj->nil (,,), Sync/0 resource /Secret:sdk-builder/sdk-builder-env obj->nil (,,), Sync/0 resource rbac.authorization.k8s.io/Role:sdk-builder/sdk-builder-role obj->nil (,,), Sync/0 resource rbac.authorization.k8s.io/RoleBinding:sdk-builder/sdk-builder-rolebinding obj->nil (,,), Sync/0 resource /Service:sdk-builder/sdk-builder obj->nil (,,), Sync/0 resource apps/Deployment:sdk-builder/sdk-builder obj->nil (,,), Sync/0 resource autoscaling/HorizontalPodAutoscaler:sdk-builder/sdk-builder-autoscaler obj->nil (,,), Sync/0 resource monitoring.coreos.com/ServiceMonitor:sdk-builder/appmetrics-sdk-builder obj->nil (,,), Sync/0 resource security.istio.io/AuthorizationPolicy:sdk-builder/sdk-builder obj->nil (,,), Sync/0 resource networking.istio.io/DestinationRule:sdk-builder/sdk-builder obj->nil (,,), Sync/0 resource networking.istio.io/Gateway:sdk-builder/sdk-builder-cluster obj->nil (,,), Sync/0 resource cert-manager.io/Certificate:istio-system/sdk-builder-cluster-https obj->nil (,,), Sync/0 resource networking.istio.io/VirtualService:sdk-builder/sdk-builder-leaderelection obj->nil (,,), Sync/0 resource networking.istio.io/Gateway:sdk-builder/sdk-builder-leaderelection obj->nil (,,), Sync/0 resource networking.istio.io/Gateway:sdk-builder/sdk-builder-mrfio obj->nil (,,), Sync/0 resource cert-manager.io/Certificate:istio-system/sdk-builder-mrfio-https obj->nil (,,), Sync/0 resource networking.istio.io/VirtualService:sdk-builder/sdk-builder-vs obj->nil (,,), Sync/0 resource cert-manager.io/Certificate:istio-system/sdk-builder.leaderelection.mrf.io obj->nil (,,), Sync/0 resource networking.istio.io/ServiceEntry:sdk-builder/sdk-mrf-io-cloudfront obj->nil (,,), Sync/0 
resource networking.istio.io/DestinationRule:sdk-builder/sdk-mrfio-dest obj->nil (,,), Sync/0 resource networking.istio.io/Gateway:sdk-builder/sdkmrfio-gw obj->nil (,,), Sync/0 resource cert-manager.io/Certificate:istio-system/sdkmrfio-https obj->nil (,,)]" | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Updating operation state. phase: Running -> Running, message: '' -> 'one or more tasks are running'" application=argocd/sdk-builder syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=Role name=sdk-builder-role namespace=sdk-builder phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: sdk-builder)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/sdk-builder | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=DestinationRule name=sdk-builder namespace=sdk-builder phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=ServiceAccount name=sdk-builder namespace=sdk-builder phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:git@github.com:MyGitAccount/myNiceRepo.git,Path:./,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:&ApplicationSourcePlugin{Name:DecryptFromSopsForPreviewEnv,Env:[]*EnvEntry{},Parameters:[]ApplicationSourcePluginParameter{},},Chart:,Ref:,}/1e189c27189403969bd31706760e7a43c2fc833a" | argocd-repo-server-667cd4fb45-89pb2 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=RoleBinding name=sdk-builder-rolebinding namespace=sdk-builder phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="getRepoObjs stats" application=argocd/sdk-builder build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=8 unmarshal_ms=7 version_ms=0 | argocd-application-controller-0 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=Certificate name=sdk-builder-mrfio-https namespace=istio-system phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=Certificate name=sdkmrfio-https namespace=istio-system phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg="Adding resource result, status: 'Pruned', phase: 'Succeeded', message: 'pruned'" application=argocd/sdk-builder kind=Gateway name=sdkmrfio-gw namespace=sdk-builder phase=Sync syncId=05403-RNPms | argocd-application-controller-1 |
time="2024-04-23T11:13:28Z" level=info msg=Syncing application=argocd/sdk-builder skipHooks=true started=false syncId=05399-kEkVM | argocd-application-controller-0 |
After that the application enters an unstable state in which Argo CD is not able to prune all the resources but, at the same time, the deployment has already been pruned, so it cannot respond to requests until we force a manual sync with Argo CD. We have some questions that we are not able to answer:
Finally, we are using quite an old Argo CD version (2.6.6) and we are planning to upgrade in the coming weeks. Even so, we haven't been able to find any bug fix that addresses this problem. Does one exist? We have attached the full logs in case they help (argocd.csv).
argocd: v2.6.6+6d4de2e
BuildDate: 2023-03-16T22:25:45Z
GitCommit: 6d4de2ec5d49fa2c6823f2b7d101607a839be3fa
GitTreeState: clean
GoVersion: go1.18.10
Compiler: gc
Platform: linux/amd64
Many thanks for your help
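The pattern in the two log tables above (a manifest cache hit followed shortly by a cache miss for the same revision) can be searched for mechanically. The sketch below scans raw log lines for that sequence. The regular expression is written against the log format quoted in this thread, and the function name `hit_then_miss` and the 60-second window are arbitrary assumptions. A miss after a hit is not proof of a bug, since cache expiry or a hard refresh produce the same sequence.

```python
import re
from datetime import datetime

# Matches repo-server lines such as:
#   time="2024-04-23T11:13:24Z" level=info msg="manifest cache miss: &ApplicationSource{...}/<sha>"
CACHE_RE = re.compile(
    r'time="(?P<time>[^"]+)".*?'
    r'msg="manifest cache (?P<result>hit|miss): .*?/(?P<rev>[0-9a-f]{40})"'
)


def hit_then_miss(lines, window_seconds=60):
    """Return (revision, gap_in_seconds) for each miss that follows a hit
    on the same revision within window_seconds."""
    last_hit = {}
    findings = []
    for line in lines:
        m = CACHE_RE.search(line)
        if m is None:
            continue
        ts = datetime.strptime(m["time"], "%Y-%m-%dT%H:%M:%SZ")
        rev = m["rev"]
        if m["result"] == "hit":
            last_hit[rev] = ts
        elif rev in last_hit:
            gap = (ts - last_hit[rev]).total_seconds()
            if 0 <= gap <= window_seconds:
                findings.append((rev, gap))
    return findings


if __name__ == "__main__":
    rev = "1e189c27189403969bd31706760e7a43c2fc833a"
    sample = [
        'time="2024-04-23T11:13:09Z" level=info msg="manifest cache hit: '
        '&ApplicationSource{Path:./,}/' + rev + '"',
        'time="2024-04-23T11:13:24Z" level=info msg="manifest cache miss: '
        '&ApplicationSource{Path:./,}/' + rev + '"',
    ]
    print(hit_then_miss(sample))
```

The lines must be fed in chronological order; sort an exported CSV by timestamp first if it is not already ordered.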
Very sorry this happened, it was extremely destructive for us. I feel a lot less crazy now that someone else experienced it.
Similar to what you alluded to, the only semi-explanation I could come up with was Redis returning empty cached manifests. I don't know whether it was caused by the repo server storing empty manifests because it swallowed an error while rendering, or by something failing while the manifests were being stored in Redis.
I also cannot explain how the "don't prune everything check" was bypassed.
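To make the "don't prune everything" idea concrete, here is a minimal sketch of the shape such a guard could take: refuse an automated sync when every managed resource is marked for pruning, unless empty applications are explicitly allowed. This is an assumption about the general logic, not a copy of Argo CD's implementation; the function name `would_wipe_everything`, the `requires_pruning` key, and the `allow_empty` parameter are hypothetical names chosen for the example.

```python
def would_wipe_everything(resources, allow_empty=False):
    """Return True when a sync should be refused because it would prune
    every resource of the application.

    resources: list of dicts with a boolean "requires_pruning" entry
               (hypothetical shape, for illustration only).
    allow_empty: opt-out for applications that may legitimately be empty.
    """
    if allow_empty or not resources:
        return False
    return all(r["requires_pruning"] for r in resources)


if __name__ == "__main__":
    everything_pruned = [
        {"kind": "Deployment", "requires_pruning": True},
        {"kind": "Service", "requires_pruning": True},
    ]
    partially_pruned = [
        {"kind": "ClusterRole", "requires_pruning": True},
        {"kind": "Deployment", "requires_pruning": False},
    ]
    print(would_wipe_everything(everything_pruned))   # guard trips
    print(would_wipe_everything(partially_pruned))    # guard does not trip
```

A guard of this all-or-nothing shape only trips when every resource is marked for pruning. If even one resource still renders, the check passes and the remaining prunes go ahead, so it offers no protection against a partially empty render.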
To add to this, we have seen this bug happen in 2.10.7. In our situation, Argo CD showed a task dry-run in which all cluster-scoped resources managed by the Argo CD app had a target of nil, while the namespaced resources rendered just fine. So Argo CD pruned the cluster-scoped resources such as CRDs, ClusterRoleBindings, and ClusterRoles.
Yep. Seeing this issue as well. In my case, it happened when trying to update an object that requires recreation, where a simple update won't work. Ideally, it should have picked only that particular resource, but it deleted the whole appset.
FYI, the incorrect deletions of cluster-scoped resources should be fixed by https://github.com/argoproj/gitops-engine/pull/597, assuming it's approved, merged, and pulled into ArgoCD.
Describe the bug
We saw Argo CD mass prune an entire application incorrectly. Sadly, these resources caused a rather large cascading effect, deleting the majority of our Kubernetes deployments.
After a hard refresh, Argo CD correctly identified that the manifests it had pruned shouldn't have been pruned. Additionally, `git ls-files` confirms that the files for the pruned resources are present on the commit.
The log message below is very similar to this issue #2573 and I have a feeling it's semi-related. We are planning on upgrading to 2.8 soon but wanted to see if anyone knows of an issue or fix that was put in v2.5.10+ already.
To Reproduce
Been trying :-/
Expected behavior
Manifests in Git shouldn't be pruned.
Screenshots
Attached are two screenshots showing the prune and then Argo CD saying the resources are out of sync after a hard refresh.
Version
Logs