IBM / cloudpak-gitops

Deployment of IBM Cloud Paks using ArgoCD / Red Hat GitOps operator.
Apache License 2.0

Parent Application for new installations sometimes times out before installation completes #284

Closed nastacio closed 10 months ago

nastacio commented 1 year ago

Describe the bug: I am following the instructions under https://github.com/IBM/cloudpak-gitops/blob/main/docs/install.md. When I try to install Cloud Pak for Integration, the parent cp4i-app application sometimes times out.

To Reproduce Steps to reproduce the behavior:

  1. Follow the instructions under https://github.com/IBM/cloudpak-gitops/blob/main/docs/install.md to install Cloud Pak for Integration
  2. Sometimes, after an hour or so, the parent cp4i-app Application is marked as failed with an error (see Additional context below)

Expected behavior: The installation should not fail this way, especially since the child Application resources completed without issue.

Additional context

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '100'
  name: cp4i-app
  namespace: openshift-gitops
  labels:
    app.kubernetes.io/instance: cp4i-app
spec:
  destination:
    namespace: kr-two
    server: 'https://kubernetes.default.svc'
...
  project: default
  source:
    helm:
      parameters:
...
    path: config/argocd-cloudpaks/cp4i
    repoURL: 'https://github.com/IBM/cloudpak-gitops'
    targetRevision: main
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
status:
  conditions:
    - lastTransitionTime: '2023-10-11T04:53:48Z'
      message: >-
        Failed sync attempt to ae9ab1dc313fbe38578717563fe82b658306cd75: one or
        more synchronization tasks completed unsuccessfully (retried 5 times).
      type: SyncError
  health:
    status: Progressing
  operationState:
    finishedAt: '2023-10-11T04:53:48Z'
    message: >-
      one or more synchronization tasks completed unsuccessfully (retried 5
      times).
    operation:
      initiatedBy:
        automated: true
      retry:
        limit: 5
      sync:
        revision: ae9ab1dc313fbe38578717563fe82b658306cd75
    phase: Failed
    retryCount: 5
    startedAt: '2023-10-11T04:46:37Z'
    syncResult:
      resources:
        - syncPhase: PreSync
          message: job.batch/presync-cp4i-storage-classes created
          name: presync-cp4i-storage-classes
          kind: Job
          version: v1
          hookPhase: Succeeded
          namespace: openshift-gitops
          hookType: PreSync
          group: batch
        - syncPhase: Sync
          message: application.argoproj.io/cp4i-app configured
          name: cp4i-app
          status: Synced
          kind: Application
          version: v1alpha1
          hookPhase: Failed
          namespace: openshift-gitops
          group: argoproj.io
      revision: ae9ab1dc313fbe38578717563fe82b658306cd75
      source:
        helm:
          parameters:
            - name: argocd_app_name
              value: '${ARGOCD_APP_NAME}'
            - name: argocd_app_namespace
              value: '${ARGOCD_APP_NAMESPACE}'
            - name: metadata.argocd_app_namespace
              value: kr-two
            - name: modules.apic
              value: 'false'
            - name: modules.client
              value: 'false'
            - name: modules.mq
              value: 'true'
            - name: repoURL
              value: '${ARGOCD_APP_SOURCE_REPO_URL}'
            - name: serviceaccount.argocd_application_controller
              value: openshift-gitops-argocd-application-controller
            - name: storageclass.rwo
              value: rook-cephfs
            - name: storageclass.rwx
              value: rook-cephfs
            - name: targetRevision
              value: '${ARGOCD_APP_SOURCE_TARGET_REVISION}'
        path: config/argocd-cloudpaks/cp4i
        repoURL: 'https://github.com/IBM/cloudpak-gitops'
        targetRevision: main
  reconciledAt: '2023-10-11T15:08:10Z'
  resources:
    - group: argoproj.io
      kind: Application
      name: cp4i-app
      namespace: openshift-gitops
      status: Synced
      version: v1alpha1
    - group: argoproj.io
      health:
        status: Missing
      kind: Application
      name: cp4i-mq
      namespace: openshift-gitops
      status: OutOfSync
      version: v1alpha1
    - group: argoproj.io
      health:
        status: Progressing
      kind: Application
      name: cp4i-platform
      namespace: openshift-gitops
      status: Synced
      version: v1alpha1
    - group: argoproj.io
      health:
        status: Healthy
      kind: Application
      name: cp4i-prereqs
      namespace: openshift-gitops
      status: Synced
      version: v1alpha1
  sourceType: Helm
  summary: {}
  sync:
    comparedTo:
      destination:
        namespace: kr-two
        server: 'https://kubernetes.default.svc'
      source:
        helm:
          parameters:
          ...
        path: config/argocd-cloudpaks/cp4i
        repoURL: 'https://github.com/IBM/cloudpak-gitops'
        targetRevision: main
    revision: ae9ab1dc313fbe38578717563fe82b658306cd75
    status: OutOfSync

nastacio commented 1 year ago

The docs list these as the default retry values:

      --retry-backoff-duration duration       Retry backoff base duration. Input needs to be a duration (e.g. 2m, 1h) (default 5s)
      --retry-backoff-factor int              Factor multiplies the base duration after each failed retry (default 2)
      --retry-backoff-max-duration duration   Max retry backoff duration. Input needs to be a duration (e.g. 2m, 1h) (default 3m0s)
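
For reference, these flags have declarative counterparts under `spec.syncPolicy.retry` in the Application manifest. The status above shows `retry.limit: 5` and a sync window of about seven minutes (04:46:37 to 04:53:48), so one possible adjustment is a more patient retry policy on the parent Application. The fragment below is only a sketch: the `limit`, `duration`, and `maxDuration` values are illustrative assumptions, not values taken from the repository, and the `project`, `source`, and `destination` fields are omitted.

```yaml
# Sketch only: retry values are hypothetical, chosen for illustration.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cp4i-app
  namespace: openshift-gitops
spec:
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    retry:
      limit: 20            # assumed value; the failed run above used limit: 5
      backoff:
        duration: 30s      # base delay between retries (CLI default: 5s)
        factor: 2          # multiplies the delay after each failed retry (CLI default: 2)
        maxDuration: 10m   # upper bound on a single delay (CLI default: 3m0s)
```

Whether a longer retry window actually resolves the failure depends on how long the child Applications take to become healthy, which the status above does not show.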