rajivml opened this issue 2 years ago
The same with envoy-gateway
@JuniorJPDJ Try:
```yaml
patches:
  - target:
      name: eg-gateway-helm-certgen
      kind: Job
    patch: |
      - path: "/spec/ttlSecondsAfterFinished"
        op: remove
```
@imroc how am I supposed to use it with helm?
@JuniorJPDJ
You need kustomize. If you must use helm, you can combine helm with kustomize: use `helmCharts` in `kustomization.yaml` to include the envoy-gateway chart, and add the patches in `kustomization.yaml` as well.
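A minimal sketch of what that `kustomization.yaml` could look like (the chart name, repo URL, and version here are placeholders I'm assuming, not values from this thread):

```yaml
# kustomization.yaml - hypothetical example combining helmCharts with a patch
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

helmCharts:
  - name: gateway-helm                 # assumed chart name
    repo: https://example.com/charts   # placeholder repo URL
    version: v1.0.0                    # placeholder version
    releaseName: eg
    namespace: envoy-gateway-system

# Strip the TTL from the certgen Job so ArgoCD can observe its completion.
patches:
  - target:
      name: eg-gateway-helm-certgen
      kind: Job
    patch: |
      - op: remove
        path: /spec/ttlSecondsAfterFinished
```

Running `kustomize build --enable-helm .` then inflates the chart and applies the patch in one pass.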
I figured it out: envoy gateway allows setting this parameter directly in the Helm values, so there is no need for kustomize:

```yaml
certgen:
  job:
    ttlSecondsAfterFinished: ~
```

Or set it to `60`; that doesn't bug ArgoCD either.
FYI - present in v2.10.7 but not in v2.10.4.
I'm experiencing the same challenge on production.
Interestingly, I ran an application locally and all the hooks (PreSync, PostSync, and SyncFail) seem to have worked without tweaking anything. I wonder why.
FYI - Running into the same issue with longhorn on ArgoCD v2.10.1+a79e0ea.
Edit: found this relevant to the Longhorn chart. It seems they added a value that can be set to `false` in order to work with ArgoCD.
I'm using v2.11.0+f287dab and hit the same problem with every version of kube-prometheus-stack from 45.0.0 onwards. Last tested with the most recent one, 58.0.0, and I'm still facing the issue. Sadly, no workaround has worked for me yet.
Maybe a ttl > 0 is also needed in https://github.com/prometheus-community/helm-charts/blob/9c41858ac9714483638d78fb560577dc37e55875/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml#L19, judging by others who changed the TTL. Maybe someone can shed some light on why changing the TTL helps in some charts?
@jkleinlercher If I understand the issue, it seems that when the TTL is set to 0, ArgoCD does not have time to detect that the job succeeded, and waits for it to finish indefinitely.
I also asked again in https://cloud-native.slack.com/archives/C01TSERG0KZ/p1714376880925509 about this issue. Would be happy if we could get some experts in to tell if "ttlSecondsAfterFinished: 0" in a hook job is a problem (or under which circumstances it could be a problem). Maybe then some off-the-shelf helm-charts could be reconfigured to solve this problem for everyone.
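For context, the pattern under discussion looks roughly like this (a generic sketch I'm assuming, not an excerpt from any chart; the job name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-hook-job            # placeholder name
  annotations:
    # ArgoCD maps Helm hooks like this onto its own PreSync hooks.
    helm.sh/hook: pre-install,pre-upgrade
spec:
  # With 0, Kubernetes deletes the Job the moment it finishes. If ArgoCD
  # only polls after that deletion, it never observes the terminal state
  # and keeps reporting "waiting for completion of hook". A value such as
  # 60 leaves a window in which ArgoCD can still see the Job succeed.
  ttlSecondsAfterFinished: 60
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: hook
          image: busybox            # placeholder image
          command: ["true"]
```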
I now recognize that `ttlSecondsAfterFinished: 0` cannot be the root cause, because it is not set in our situation. In kube-prometheus-stack there is an API condition around this setting which is not met in current clusters:
So there must be another root cause for the "stuck in pending deletion" situation.
Okay, locally on a k3d cluster I can recreate the problem with kube-prometheus-stack, but ArgoCD is not stuck in "pending deletion" but in "running", although the job has already finished and has already been deleted:

```
kubectl get job -n monitoring sx-kube-prometheus-stack-admission-create
Error from server (NotFound): jobs.batch "sx-kube-prometheus-stack-admission-create" not found
```
I wonder if ServerSideApply has something to do with it ...
Reporting the same problem on a k3s cluster when trying to install the latest version of the kube-prometheus-stack Helm chart.
Meanwhile I came across https://github.com/argoproj/argo-cd/issues/15292 and I wonder if this is the same problem. It involves the same deletion policy as in the issue mentioned … sadly, the referenced PR https://github.com/argoproj/gitops-engine/pull/461 was never merged. @leoluz or @nazarewk, is there some chance to help out on this?
Same problem now using Argo v2.11.0-rc3+20fd621 with current kube-prometheus-stack
Same problem here on 5 different clusters (K3s, Kubespray & managed); on none of them can I deploy the current kube-prometheus-stack (chart v58.3.1) using Argo CD 2.10.9.
Issue has been fixed in https://github.com/prometheus-community/helm-charts/pull/4510
The job `job-createSecret.yaml` does not complete syncing in 58.3.2. Setting `prometheusOperator.admissionWebhooks.patch.ttlSecondsAfterFinished` to 30 helped me solve the problem.
Still facing the issue when trying to deploy the kube-prometheus-stack Helm chart version 58.6.1, where `prometheusOperator.admissionWebhooks.patch.ttlSecondsAfterFinished` is set to 60. ArgoCD version: v2.9.0+9cf0c69
I have the same issue with the same configuration used for ttlSecondsAfterFinished; tested 60 and 30 seconds.

```yaml
prometheusOperator:
  enabled: true
  admissionWebhooks:
    patch:
      enabled: true
      ttlSecondsAfterFinished: 30
```
@jsantosa-minsait By any chance, do you have Istio sidecar injection enabled? That causes the pod to complete the patching and creation, but the pod keeps on running.
Hi @prashant0085, no, I don't. I have Cilium installed, and Kyverno with admission controller hooks that may alter or patch resources. However, that is not the case here.
Hi, experiencing this issue as well. Environment: OpenShift 4.14, ArgoCD v2.10.10+9b3d0c0, PostSync hook job. Just a simple kustomization.yaml with two resources and a PostSync hook job, no Helm. The job actually takes about 2 minutes to complete, but ArgoCD only marks the job as finished after approx. 10 minutes. Tried different settings (0, 60, 120) for ttlSecondsAfterFinished in the job spec, but no change in behaviour. Also monitored the memory and CPU usage of the ArgoCD pods (controller, repo, applicationset controller, etc.); no pod even comes close to its CPU or memory limit, so no issue there.
Hi,
We are seeing this issue quite often: app sync gets stuck in "waiting for completion of hook", and these hooks never complete.
As you can see, the application below got stuck in the secret creation phase, and somehow that secret never got created.
I stripped out all unnecessary details. This is how the secret is created and used by the job.
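The original manifests did not survive in this thread. As a stand-in, a PreSync hook Secret consumed by a hook Job typically looks like the following sketch (the names, image, and secret contents are placeholders I'm assuming, loosely mirroring the hook name in the logs below):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: xxx-migrations                  # placeholder name
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
stringData:
  DATABASE_URL: "postgres://..."        # placeholder value
---
apiVersion: batch/v1
kind: Job
metadata:
  name: xxx-migrations                  # placeholder name
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example/migrations:latest   # placeholder image
          envFrom:
            - secretRef:
                name: xxx-migrations    # the Job reads the hook Secret
```

ArgoCD creates both hook resources at the start of the sync phase and waits for the Job to reach a terminal state; if the Secret hook itself never reports healthy, the sync hangs exactly as in the logs below.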
```
kubectl -n argocd logs argocd-server-768f46f469-j98h6 | grep xxx-migrations       # No matching logs
kubectl -n argocd logs argocd-repo-server-57bdbf899c-9lxhr | grep xxx-migrations  # No matching logs
kubectl -n argocd logs argocd-repo-server-57bdbf899c-7xvs7 | grep xxx-migrations  # No matching logs
kubectl -n argocd logs argocd-server-768f46f469-tqp8p | grep xxx-migrations       # No matching logs
```
```
[testadmin@server0 ~]$ kubectl -n argocd logs argocd-application-controller-0 | grep orchestrator-migrations
time="2021-08-02T02:16:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:16:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:19:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:19:26Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:22:17Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:22:17Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:22:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:25:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:25:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:28:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:28:26Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:31:25Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
time="2021-08-02T02:31:26Z" level=info msg="Resuming in-progress operation. phase: Running, message: waiting for completion of hook /Secret/xxx-migrations-0.0.19-private4.1784494" application=xxx
```
Environment:
ArgoCD Version: 2.0.1
Please let me know if any other info is required.