quadrinho opened this issue 1 year ago
Hello,
can anyone help me?
@crenshaw-dev can you help me?
Can you post the `Application`, to be able to better discern what's going on?
> Can you post the `Application`, to be able to better discern what's going on?
Hello @blakepettersson,
below is the `Application`:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/manifest-generate-paths: .
    argocd.argoproj.io/sync-wave: "4"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"argoproj.io/v1alpha1","kind":"Application","metadata":{"annotations":{"argocd.argoproj.io/manifest-generate-paths":".","argocd.argoproj.io/sync-wave":"4"},"finalizers":["resources-finalizer.argocd.argoproj.io"],"labels":{"argocd.argoproj.io/instance":"rootapps-dev-mobile56-platform","project":"platformapps-dev-mobile56"},"name":"external-dns-mobile56-dev","namespace":"argocd"},"spec":{"destination":{"namespace":"platform","server":"https://36F950CAB58ED67F6DF6F037ACAD9074.gr7.eu-central-1.eks.amazonaws.com"},"project":"platformapps-dev-mobile56","source":{"helm":{"valueFiles":["values/platform/dev-mobile56.yaml","values/environment/dev.yaml"]},"path":"external-dns","repoURL":"https://XXXXXXX","targetRevision":"update/eks-1.24"},"syncPolicy":{"automated":{"prune":true,"selfHeal":true},"managedNamespaceMetadata":{"labels":{"pod-security.kubernetes.io/audit":"privileged","pod-security.kubernetes.io/audit-version":"latest","pod-security.kubernetes.io/enforce":"privileged","pod-security.kubernetes.io/enforce-version":"v1.24","pod-security.kubernetes.io/warn":"privileged","pod-security.kubernetes.io/warn-version":"latest"}},"syncOptions":["CreateNamespace=true","ApplyOutOfSyncOnly=true","PruneLast=true"]}}}
  creationTimestamp: "2023-07-19T09:14:44Z"
  finalizers:
  - resources-finalizer.argocd.argoproj.io
  generation: 67117
  labels:
    argocd.argoproj.io/instance: rootapps-dev-mobile56-platform
    project: platformapps-dev-mobile56
  name: external-dns-mobile56-dev
  namespace: argocd
  resourceVersion: "151294717"
  uid: be61a824-bfbd-4f26-aadc-21588926bb1d
operation:
  initiatedBy:
    automated: true
  retry:
    limit: 5
  sync:
    prune: true
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    syncOptions:
    - CreateNamespace=true
    - ApplyOutOfSyncOnly=true
    - PruneLast=true
spec:
  destination:
    namespace: platform
    server: https://36F950CAB58ED67F6DF6F037ACAD9074.gr7.eu-central-1.eks.amazonaws.com
  project: platformapps-dev-mobile56
  source:
    helm:
      valueFiles:
      - values/platform/dev-mobile56.yaml
      - values/environment/dev.yaml
    path: external-dns
    repoURL: https://XXXXXXX
    targetRevision: update/eks-1.24
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    managedNamespaceMetadata:
      labels:
        pod-security.kubernetes.io/audit: privileged
        pod-security.kubernetes.io/audit-version: latest
        pod-security.kubernetes.io/enforce: privileged
        pod-security.kubernetes.io/enforce-version: v1.24
        pod-security.kubernetes.io/warn: privileged
        pod-security.kubernetes.io/warn-version: latest
    syncOptions:
    - CreateNamespace=true
    - ApplyOutOfSyncOnly=true
    - PruneLast=true
status:
  health:
    status: Healthy
  history:
  - deployStartedAt: "2023-07-28T07:35:05Z"
    deployedAt: "2023-07-28T07:36:01Z"
    id: 4062
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:36:41Z"
    deployedAt: "2023-07-28T07:37:47Z"
    id: 4063
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:38:28Z"
    deployedAt: "2023-07-28T07:39:31Z"
    id: 4064
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:40:09Z"
    deployedAt: "2023-07-28T07:41:05Z"
    id: 4065
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:41:47Z"
    deployedAt: "2023-07-28T07:42:43Z"
    id: 4066
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:43:24Z"
    deployedAt: "2023-07-28T07:44:28Z"
    id: 4067
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:45:14Z"
    deployedAt: "2023-07-28T07:46:12Z"
    id: 4068
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:46:52Z"
    deployedAt: "2023-07-28T07:47:56Z"
    id: 4069
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:48:43Z"
    deployedAt: "2023-07-28T07:49:48Z"
    id: 4070
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  - deployStartedAt: "2023-07-28T07:50:31Z"
    deployedAt: "2023-07-28T07:51:35Z"
    id: 4071
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    source:
      helm:
        valueFiles:
        - values/platform/dev-mobile56.yaml
        - values/environment/dev.yaml
      path: external-dns
      repoURL: https://XXXXXXX
      targetRevision: update/eks-1.24
  operationState:
    message: waiting for healthy state of rbac.authorization.k8s.io/ClusterRole/external-dns-mobile56-dev
      and 4 more resources
    operation:
      initiatedBy:
        automated: true
      retry:
        limit: 5
      sync:
        prune: true
        revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
        syncOptions:
        - CreateNamespace=true
        - ApplyOutOfSyncOnly=true
        - PruneLast=true
    phase: Running
    startedAt: "2023-07-28T07:52:18Z"
    syncResult:
      resources:
      - group: ""
        hookPhase: Running
        kind: Namespace
        message: namespace/platform serverside-applied
        name: platform
        namespace: ""
        status: Synced
        syncPhase: PreSync
        version: v1
      revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
      source:
        helm:
          valueFiles:
          - values/platform/dev-mobile56.yaml
          - values/environment/dev.yaml
        path: external-dns
        repoURL: https://XXXXXXX
        targetRevision: update/eks-1.24
  reconciledAt: "2023-07-28T07:52:30Z"
  resources:
  - health:
      status: Healthy
    kind: Service
    name: external-dns-mobile56-dev
    namespace: platform
    status: Synced
    version: v1
  - kind: ServiceAccount
    name: external-dns
    namespace: platform
    status: Synced
    version: v1
  - group: apps
    health:
      status: Healthy
    kind: Deployment
    name: external-dns-mobile56-dev
    namespace: platform
    status: Synced
    version: v1
  - group: iam.aws.upbound.io
    kind: Policy
    name: dev-mobile56-external-dns
    status: Synced
    version: v1beta1
  - group: iam.aws.upbound.io
    kind: Role
    name: dev-mobile56-external-dns
    status: Synced
    version: v1beta1
  - group: iam.aws.upbound.io
    kind: RolePolicyAttachment
    name: dev-mobile56-external-dns
    status: Synced
    version: v1beta1
  - group: rbac.authorization.k8s.io
    kind: ClusterRole
    name: external-dns-mobile56-dev
    status: Synced
    version: v1
  - group: rbac.authorization.k8s.io
    kind: ClusterRoleBinding
    name: external-dns-mobile56-dev-viewer
    status: Synced
    version: v1
  sourceType: Helm
  summary:
    images:
    - XXXXXXX/external-dns/external-dns:v0.12.2
  sync:
    comparedTo:
      destination:
        namespace: platform
        server: https://36F950CAB58ED67F6DF6F037ACAD9074.gr7.eu-central-1.eks.amazonaws.com
      source:
        helm:
          valueFiles:
          - values/platform/dev-mobile56.yaml
          - values/environment/dev.yaml
        path: external-dns
        repoURL: https://XXXXXXX
        targetRevision: update/eks-1.24
    revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
    status: OutOfSync
```
Hello @blakepettersson and @crenshaw-dev,
can you help me?
Looks to me like the status is OutOfSync because technically (even though all the resources are "Synced") the sync operation hasn't ended.
```yaml
operationState:
  message: waiting for healthy state of rbac.authorization.k8s.io/ClusterRole/external-dns-mobile56-dev
    and 4 more resources
  operation:
    initiatedBy:
      automated: true
    retry:
      limit: 5
    sync:
      prune: true
      revision: 2685c4c3ce9f554fe661f2e8df36de32ded65f63
      syncOptions:
      - CreateNamespace=true
      - ApplyOutOfSyncOnly=true
      - PruneLast=true
  phase: Running
```
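For reference, the two fields involved here are `.status.operationState.phase` and `.status.sync.status`. A minimal sketch of what a settled state would look like once the operation ends (assumed values, not taken from the app above):

```yaml
status:
  operationState:
    phase: Succeeded   # the sync operation has finished
  sync:
    status: Synced     # the live state matches the target revision
```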
Hello @crenshaw-dev,
yes, I agree with you, but why?
After some time the application looks Healthy and Synced, but a few seconds later the OutOfSync appears again...
I really don't understand why :(
Do you have any idea?
Thanks!!!
I only have a hunch: I suspect that `PruneLast=true` is causing the app to include resource health in the sync operation. I've also seen PostSync hooks extend the sync operation until resources are healthy.
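For context, a PostSync hook is any resource carrying the `argocd.argoproj.io/hook: PostSync` annotation, and the sync operation stays in progress until such hooks complete. A hypothetical minimal example (the Job name and image are made up for illustration):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: post-sync-smoke-test  # hypothetical name
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: check
        image: busybox  # placeholder image
        command: ["sh", "-c", "exit 0"]
```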
Thank you for the answer @crenshaw-dev! If I understand correctly, you are saying the sync is in progress because the application's resources are "waiting for healthy state of ..".
You are suggesting to try the following: `PruneLast=false`?
Correct!
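For reference, a minimal sketch of the suggested change, applied to the `syncPolicy` from the manifest above (everything else unchanged):

```yaml
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  syncOptions:
  - CreateNamespace=true
  - ApplyOutOfSyncOnly=true
  - PruneLast=false  # or drop the PruneLast option entirely
```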
Hello @crenshaw-dev,
I have tried to set `PruneLast=false` both in the app-of-apps and in the specific app:
[screenshot]
and
[screenshot]
But the application aws-load-balancer-controller still shows "Last Sync" as Syncing.
I'm at my wits' end.
Can you help me?
Thanks a lot!
I was running into the same issue, and was able to fix it by removing the `CreateNamespace=true` sync option and all managed namespace metadata from my affected applications (instead I am now using a K8s manifest to manage the namespace). I am not using the `PruneLast=true` sync option at all.
So an application like this:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
spec:
  # ... some omitted fields
  destination:
    server: https://kubernetes.default.svc
    namespace: my-namespace
  syncPolicy:
    syncOptions:
    - CreateNamespace=true
    managedNamespaceMetadata:
      labels:
        foo: bar
```
would be replaced by this:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
spec:
  # ... some omitted fields
  destination:
    server: https://kubernetes.default.svc
    namespace: my-namespace
---
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  labels:
    foo: bar
```
I assume that this issue is caused by the server-side-apply mechanism used for the `managedNamespaceMetadata`, but I was not able to verify that assumption. For more context: I am running ArgoCD on AKS (k8s version 1.26.6) and the issue only became visible after upgrading ArgoCD to v2.8.x.
@sym-stiller any chance you are using Rancher here? I'm facing a similar issue and I suspect that Rancher altering the namespace's metadata could be the cause.
@fredleger No, I'm not using Rancher.
But AFAIK AKS also alters namespace labels: it adds the `kubernetes.io/metadata.name` label to each namespace (though this could also be standard K8s behavior, I don't know).
This is why I'm assuming that the server-side apply performed by ArgoCD is the root cause of the problem. There will always be a difference between the managedNamespaceMetadata and the actual metadata, caused by the additional labels injected by AKS (or Rancher in your case).
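To illustrate the kind of drift being described, using the hypothetical `my-app` example from above: the Application declares only `foo: bar`, but the live namespace ends up with labels it never asked for.

```yaml
# Declared via managedNamespaceMetadata:
#   labels:
#     foo: bar
#
# What the live Namespace actually looks like (hypothetical, but the
# kubernetes.io/metadata.name label is added automatically by the API server):
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  labels:
    foo: bar
    kubernetes.io/metadata.name: my-namespace
```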
This PR seems to fix the issue in v2.8.4 and above: https://github.com/argoproj/argo-cd/pull/15488
The application's `.spec.syncPolicy.managedNamespaceMetadata` does not match its `.status.operationState.syncResult.managedNamespaceMetadata`. When this is the case (as of v2.8.0), the application will show OutOfSync and have a blank app diff. Since the application has auto-sync enabled, ArgoCD is likely performing this sync to populate the tracking field (`.status.operationState.syncResult.managedNamespaceMetadata`).
I suggest verifying that the ArgoCD CRDs in your cluster are up to date for your ArgoCD version; specifically, the Application CRD needs to define this tracking field.
If the application CRDs are up to date, it seems that something is deleting the managedNamespaceMetadata tracking data, which is causing ArgoCD to try to add it back. To verify this, you could populate the tracking fields yourself, then check to see if they remain populated or if something is causing them to be deleted.
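To make that verification concrete, a populated tracking field would look roughly like the sketch below (field path per the discussion above; the label values are just the ones from this thread's Application):

```yaml
status:
  operationState:
    syncResult:
      managedNamespaceMetadata:
        labels:
          pod-security.kubernetes.io/enforce: privileged
          pod-security.kubernetes.io/warn: privileged
          # ...mirroring .spec.syncPolicy.managedNamespaceMetadata
```

If this stays populated after a sync but the app still flaps to OutOfSync, something else is wiping it.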
We ended up rolling back the ArgoCD 2.8.6 upgrade due to this, as the application-controller performance tanked after the upgrade. We tried doubling the number of ArgoCD application-controller shards before rolling back, but the application-controllers were still not able to keep up with all of the repeated reconciles. Additionally, this caused argocd-server to scale out to maxReplicas and triggered various alerts.
We've now completed the upgrade to 2.8.6 successfully; all that was required was to sync all of the ArgoCD applications so that this tracking field was populated. Syncing the ArgoCD applications was not possible in the first upgrade attempt, because some applications had other pending changes that could not be synced at that time. Also, most of the other applications are the responsibility of different teams.
It may be worth adding a note to the 2.8 upgrade guide stating that the CRD updates are required when using managedNamespaceMetadata, and if autosync isn't enabled you will need to manually sync these apps after the upgrade in order for the performance of the argocd-application-controller to go back to normal. Specifically, this would allow companies to inventory applications that are out-of-sync before the upgrade and allow requesting proper approval or cross-team communication to happen before taking on the upgrade.
Some of our environments do not use managedNamespaceMetadata, so we initially missed this when testing the upgrade process.
> It may be worth adding a note to the 2.8 upgrade guide stating that the CRD updates are required when using managedNamespaceMetadata, and if autosync isn't enabled you will need to manually sync these apps after the upgrade in order for the performance of the argocd-application-controller to go back to normal. Specifically, this would allow companies to inventory applications that are out-of-sync before the upgrade and allow requesting proper approval or cross-team communication to happen before taking on the upgrade.
That's a good point, I can add that - although it must be said that CRD updates are ~pretty much required~ highly recommended for all major/minor Argo CD upgrades.
> That's a good point, I can add that - although it must be said that CRD updates are ~pretty much required~ highly recommended for all major/minor Argo CD upgrades.
FWIW, my CRDs were updated with each upgrade, as I use ArgoCD to manage itself and its CRDs.
Additionally, I think the problem I faced was exacerbated by resource status updates that were triggering refreshes outside of the standard reconciliation loop (#13912). We will resolve these misconfigurations in the future; perhaps others do not have this problem.
**Checklist:**

- [x] I've pasted the output of `argocd version`.

**Describe the bug**
Hello,
in ArgoCD I can see that an application (working correctly in the target cluster) has a sync status of OutOfSync; after a few seconds the sync status becomes Synced and the last sync shows as Syncing; a few seconds later the sync status goes back to OutOfSync without any change on GitHub.
It seems to be a sort of "loop".
The target cluster is EKS version 1.24 (the problem is also present with 1.25). I don't have the same problem on EKS 1.23.
**To Reproduce**

Install an application inside the cluster (I am using app-of-apps; please see the screenshots below).
**Expected behavior**

Once deployed and Synced, the application should not go OutOfSync if nothing has changed.
**Screenshots**

APPS in APP
OutOfSync: [screenshot]
Synced: [screenshot]
**Version**

ArgoCD version: v2.9.0+b90f3bc
Helm chart version: 5.40.0
**Logs**

I've pasted above the `describe` output of the application that keeps going from OutOfSync to Synced and back to OutOfSync.