argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Resource Hook Job gets stuck in `Running` status indefinitely when `.spec.ttlSecondsAfterFinished` is set to `0` #18884

Open yz89122 opened 4 months ago

yz89122 commented 4 months ago

Describe the bug

A resource hook Job gets stuck in the Running status indefinitely when .spec.ttlSecondsAfterFinished is set to 0.

To Reproduce

  1. Use the manifests generated by `helm template oci://docker.io/envoyproxy/gateway-helm --version v1.0.2 -n envoy-gateway-system`. The generated manifests include a Job annotated with `helm.sh/hook: pre-install, pre-upgrade`:
# Source: gateway-helm/templates/certgen.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: release-name-gateway-helm-certgen
  namespace: 'envoy-gateway-system'
  labels:
    helm.sh/chart: gateway-helm-v1.0.2
    app.kubernetes.io/name: gateway-helm
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "v1.0.2"
    app.kubernetes.io/managed-by: Helm
  annotations:
    "helm.sh/hook": pre-install, pre-upgrade
spec:
  backoffLimit: 1
  completions: 1
  parallelism: 1
  template:
    metadata:
      labels:
        app: certgen
    spec:
      containers:
        - command:
            - envoy-gateway
            - certgen
          env:
            - name: ENVOY_GATEWAY_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: KUBERNETES_CLUSTER_DOMAIN
              value: cluster.local
          image: docker.io/envoyproxy/gateway:v1.0.2
          imagePullPolicy: Always
          name: envoy-gateway-certgen
      restartPolicy: Never
      securityContext:
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: release-name-gateway-helm-certgen
  ttlSecondsAfterFinished: 0
  2. In the Argo CD Web UI, click Sync.
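
For a smaller reproduction that does not depend on the Envoy Gateway chart, a minimal hook Job along the following lines may be enough. This is an untested sketch: the Job name, the container name, and the `busybox` image are hypothetical choices not taken from the chart above, and only the hook annotation and `ttlSecondsAfterFinished: 0` are carried over from it.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ttl-zero-hook            # hypothetical name
  annotations:
    "helm.sh/hook": pre-install, pre-upgrade   # same hook annotation as the certgen Job
spec:
  backoffLimit: 1
  template:
    spec:
      containers:
        - name: noop             # hypothetical container name
          image: busybox         # assumption: any image that exits 0 quickly
          command: ["true"]      # finish immediately with exit code 0
      restartPolicy: Never
  ttlSecondsAfterFinished: 0     # the setting under suspicion
```

Changing only the last field to a non-zero value would give a direct comparison case.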

Expected behavior

The status changes to Succeeded once the Job (resource hook) completes.

Screenshots

Screenshot 2024-07-02 at 11 28 58 AM

Version

argocd: v2.11.0+d3f33c0
  BuildDate: 2024-05-07T16:21:23Z
  GitCommit: d3f33c00197e7f1d16f2a73ce1aeced464b07175
  GitTreeState: clean
  GoVersion: go1.21.9
  Compiler: gc
  Platform: darwin/arm64
argocd-server: v2.11.0+d3f33c0
  BuildDate: 2024-05-07T16:01:41Z
  GitCommit: d3f33c00197e7f1d16f2a73ce1aeced464b07175
  GitTreeState: clean
  GoVersion: go1.21.9
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.2.1 2023-10-19T20:13:51Z
  Helm Version: v3.14.3+gf03cc04
  Kubectl Version: v0.26.11
  Jsonnet Version: v0.20.0
andrii-korotkov-verkada commented 4 months ago

Does it reproduce with ttlSecondsAfterFinished other than 0?

mavendonovanhubbard commented 2 months ago

This looks like a duplicate of issue #6880.

mavendonovanhubbard commented 2 months ago

> Does it reproduce with ttlSecondsAfterFinished other than 0?

No, it does not. I created this chart to reproduce the error: https://github.com/mavendonovanhubbard/hook-chart. I'm using Argo CD core v2.12.3. Here is my app manifest:

project: default
source:
  repoURL: 'https://github.com/mavendonovanhubbard/hook-chart'
  path: .
  targetRevision: main
  helm:
    values: 'ttlSecondsAfterFinished: 0'
destination:
  server: 'https://kubernetes.default.svc'
  namespace: hook-chart
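
The fragment above appears to be only the `spec` portion of an Argo CD Application. To apply it with `kubectl`, it would need to be wrapped in a full resource, roughly as sketched below. The `metadata.name` and `metadata.namespace` values are assumptions (the thread does not state them), and everything under `spec` is copied from the fragment.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: hook-chart               # hypothetical name, not given in the thread
  namespace: argocd              # assumption: namespace where Argo CD is installed
spec:
  project: default
  source:
    repoURL: 'https://github.com/mavendonovanhubbard/hook-chart'
    path: .
    targetRevision: main
    helm:
      # Inline Helm values passed as a string; 0 is the failing case reported above
      values: 'ttlSecondsAfterFinished: 0'
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: hook-chart
```

Setting the inline value to something other than `0` gives the non-zero case, which per the comment above does not reproduce the problem.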