Open neizmirasego opened 4 months ago
I've run into the same issue here :(
We have enabled more debug logs:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  labels:
    app.kubernetes.io/name: argocd-cmd-params-cm
    app.kubernetes.io/part-of: argocd
data:
  controller.log.level: "debug"

# excerpt from the application controller pod spec
terminationGracePeriodSeconds: 30
serviceAccountName: argocd-application-controller
containers:
- args:
  - /usr/local/bin/argocd-application-controller
  - --gloglevel
  - "4"
and found additional log messages that do not appear on a healthy cluster:
"Failed to get event! Re-creating the watcher." resourceVersion="301181841"
Restarting RetryWatcher at RV="301181583"
"Watch failed" err="unknown"
Both come from the Kubernetes Go client (client-go).
Same or similar issue: https://github.com/argoproj/argo-cd/issues/18467
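To compare a healthy cluster with an affected one, it may help to count how often these messages occur in each controller log. Below is a minimal sketch, assuming the logs are available as plain text lines. The function name `count_watch_messages` and the `WATCH_MESSAGES` tuple are hypothetical helpers for illustration, not part of Argo CD or client-go; the substrings are copied from the log lines quoted above.

```python
# Substrings copied from the log lines quoted above.
WATCH_MESSAGES = (
    "Failed to get event! Re-creating the watcher.",
    "Restarting RetryWatcher at RV=",
    "Watch failed",
)


def count_watch_messages(lines):
    """Count how many log lines contain each known watcher message."""
    counts = {message: 0 for message in WATCH_MESSAGES}
    for line in lines:
        for message in WATCH_MESSAGES:
            if message in line:
                counts[message] += 1
    return counts
```

Running it against logs from both clusters over the same time window would show whether the messages are truly absent on the healthy cluster or only less frequent.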
Describe the bug
The application controller hangs at seemingly random times and stops processing apps. There is not enough observability to identify the root cause. In monitoring, the workqueue depth grows from 0 until it reaches the total number of apps. Only restarting the application controller helps. There are no error messages in the logs, even with debug mode enabled. The only symptom in the logs is that grpc.time_ms in the repo server logs increases from ~0.1 to ~2000. In the controller logs we see several "Watch failed" messages, but they also appear when the issue does not occur.
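Since the jump in grpc.time_ms is the only visible symptom, one way to pinpoint when a hang starts is to scan the repo server logs for slow calls. Below is a minimal sketch, assuming the logs are in text format with the field written as `grpc.time_ms=<number>` (JSON-formatted logs would need a different parser). The function name `slow_grpc_calls` and the 1000 ms default threshold are assumptions chosen for illustration.

```python
import re

# Assumes text-format logs where the field appears as grpc.time_ms=<number>.
GRPC_TIME_RE = re.compile(r"grpc\.time_ms=([0-9]+(?:\.[0-9]+)?)")


def slow_grpc_calls(lines, threshold_ms=1000.0):
    """Return (line_number, time_ms) for each call slower than threshold_ms."""
    slow = []
    for number, line in enumerate(lines, start=1):
        match = GRPC_TIME_RE.search(line)
        if match is None:
            continue  # line has no grpc.time_ms field
        time_ms = float(match.group(1))
        if time_ms > threshold_ms:
            slow.append((number, time_ms))
    return slow
```

Correlating the first slow call with the moment the workqueue depth starts to climb could help narrow down whether the repo server slowdown precedes or follows the controller hang.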
Repo server logs:
Application controller logs:
To Reproduce
Nothing special: deploy Argo CD and create applications whose source is an internal self-hosted GitLab and whose destination is OpenShift.
Expected behavior
All applications should be processed and deployed.
Screenshots
Version