Open yellowhat opened 1 year ago
I am experiencing the same. Every 12 hours I get about 40 errors that all say err="context canceled".
Most of the instances where these errors appear come right after an attempt to sync an externally managed cluster. The cluster does sync eventually, but these errors are thrown first.
Time | Host
-----------------------------
14:39:19 UTC | aks-general-00000-vmss000002-argocd
"Watch failed" err="context canceled"
-----------------------------
(the same entry repeats 10 times at the same timestamp)
We are still seeing this issue in Argo CD 2.11.2, and it is causing deployment outages for some of our users. We have one installation with multiple controllers that manage 40+ clusters.
This might be unrelated, but if you are using a limited RBAC role for the Argo CD application controller instead of the admin role with permissions on all cluster resources, you may want to either configure resource inclusions/exclusions manually, or use the respectRBAC feature, which lets Argo CD figure out automatically which resources it has access to and needs to monitor/watch.
Ref:
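As a concrete illustration, the respectRBAC setting mentioned above is configured in the `argocd-cm` ConfigMap. This is a minimal sketch; the namespace and the choice between the two supported modes are assumptions for your setup:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd        # adjust to your installation namespace
  labels:
    app.kubernetes.io/part-of: argocd
data:
  # Tell the application controller to check its own RBAC and skip
  # resources it cannot access, instead of opening watches that will
  # keep failing. "normal" checks the bound roles; "strict" additionally
  # performs access-review checks against the cluster.
  resource.respectRBAC: "normal"
```

With a limited controller role, this should reduce the stream of failed-watch errors for resource types the controller was never allowed to see.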
We are also seeing 75-200 of these log entries from each application controller every 12 hours on v2.11.3. The timing correlates with the cluster's cache age dropping to 0:
Here's a zoomed-in look at a 15 minute window:
I don't know what this correlation means but thought it might be worth sharing.
This morning I found that the controller log had been logging this error every second for the whole night:
E0918 05:28:13.841461 7 retrywatcher.go:130] "Watch failed" err="context canceled"
E0918 05:28:14.842231 7 retrywatcher.go:130] "Watch failed" err="context canceled"
E0918 05:28:15.842669 7 retrywatcher.go:130] "Watch failed" err="context canceled"
There are problems with my Argo CD installation, but this error does not help identify the cause.
Checklist:
- argocd version

Describe the bug
Hi, I am using the `argo-cd` 5.46.2 helm chart. I have noticed that every 12 hours the `application-controller` throws the following error. According to this discussion, some `watch` permissions are missing. Currently the role associated with the `application-controller` service account has `watch` on `secrets` and `configmaps`:

Is there something else missing?
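For reference, a limited role granting `watch` on only those two resource types might look roughly like this. This is a sketch, not the reporter's actual manifest; the role name and namespace are assumptions:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argocd-application-controller
  namespace: argocd        # adjust to your installation namespace
rules:
  # Core-group resources the controller is allowed to cache and watch.
  - apiGroups: [""]
    resources: ["secrets", "configmaps"]
    verbs: ["get", "list", "watch"]
```

Note that the controller attempts to watch many more resource kinds than these by default, so a role this narrow will produce failed watches unless resource inclusions/exclusions or respectRBAC are also configured.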
To Reproduce
Expected behavior
No error
Version
Logs