fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

Helm Controller fails to process helmreleases #153

Closed nab-gha closed 3 years ago

nab-gha commented 3 years ago

I have noticed this scenario repeatedly on a number of different clusters: Helm Controller seems to stop doing periodic reconciliations. Deleting the pod fixes this.

After upgrading to v0.3.0, I noticed that some HelmReleases were reporting issues:

$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io
NAMESPACE                NAME                                              READY   STATUS                                                                                            AGE
apps                     microservice-1                                    True    Release reconciliation succeeded                                                                  20h
base                     microservice-1                                    False   HelmChart 'gotk-system/base-microservice-1' is not ready                                          20h
bootstrap                integration-test                                  True    Release reconciliation succeeded                                                                  20h
bootstrap                microservice-1                                    True    Release reconciliation succeeded                                                                  20h
bootstrap                microservice-2                                    True    Release reconciliation succeeded                                                                  20h
cluster-addons           pr172922-aks-helmchart-cert-manager               True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-cp-poll                    True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-datadog                    False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-datadog' is not ready             10d
cluster-addons           pr172922-aks-helmchart-disable-updates            True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-externaldns                False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-externaldns' is not ready         10d
cluster-addons           pr172922-aks-helmchart-nginx-ingress-controller   True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-rolessetup                 False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-rolessetup' is not ready          10d
cluster-addons           pr172922-aks-helmchart-tenancyoperator            False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-tenancyoperator' is not ready     10d
cluster-addons           pr172922-aks-helmchart-vaultregistration          False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-vaultregistration' is not ready   10d
cluster-addons           pr172922-aks-kubestatemetrics                     False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-kubestatemetrics' is not ready              10d
mgmt                     microservice-1                                    True    Release reconciliation succeeded                                                                  20h
vault-injector-webhook   vault-injector-webhook                            False   HelmChart 'cluster-addons/vault-injector-webhook-vault-injector-webhook' is not ready             10d

Note that it has been over three minutes since the last reconcile, yet the HelmRelease intervals are set to 1 or 3 minutes. So I deleted the pod:

a669981@vc2crtp2473106n:~$ kubectl -n gotk-system delete pod helm-controller-6b8747c4dc-shkdr
pod "helm-controller-6b8747c4dc-shkdr" deleted
a669981@vc2crtp2473106n:~$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io
NAMESPACE                NAME                                              READY   STATUS                                                                                            AGE
apps                     microservice-1                                    True    Release reconciliation succeeded                                                                  20h
base                     microservice-1                                    False   HelmChart 'gotk-system/base-microservice-1' is not ready                                          20h
bootstrap                integration-test                                  True    Release reconciliation succeeded                                                                  20h
bootstrap                microservice-1                                    True    Release reconciliation succeeded                                                                  20h
bootstrap                microservice-2                                    True    Release reconciliation succeeded                                                                  20h
cluster-addons           pr172922-aks-helmchart-cert-manager               True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-cp-poll                    True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-datadog                    False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-datadog' is not ready             10d
cluster-addons           pr172922-aks-helmchart-disable-updates            True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-externaldns                False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-externaldns' is not ready         10d
cluster-addons           pr172922-aks-helmchart-nginx-ingress-controller   True    Release reconciliation succeeded                                                                  10d
cluster-addons           pr172922-aks-helmchart-rolessetup                 False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-rolessetup' is not ready          10d
cluster-addons           pr172922-aks-helmchart-tenancyoperator            False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-tenancyoperator' is not ready     10d
cluster-addons           pr172922-aks-helmchart-vaultregistration          False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-helmchart-vaultregistration' is not ready   10d
cluster-addons           pr172922-aks-kubestatemetrics                     False   HelmChart 'cluster-addons/cluster-addons-pr172922-aks-kubestatemetrics' is not ready              10d
mgmt                     microservice-1                                    True    Release reconciliation succeeded                                                                  20h
vault-injector-webhook   vault-injector-webhook                            False   HelmChart 'cluster-addons/vault-injector-webhook-vault-injector-webhook' is not ready             10d
$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io
NAMESPACE                NAME                                              READY   STATUS                             AGE
apps                     microservice-1                                    True    Release reconciliation succeeded   20h
base                     microservice-1                                    True    Release reconciliation succeeded   20h
bootstrap                integration-test                                  True    Release reconciliation succeeded   20h
bootstrap                microservice-1                                    True    Release reconciliation succeeded   20h
bootstrap                microservice-2                                    True    Release reconciliation succeeded   20h
cluster-addons           pr172922-aks-helmchart-cert-manager               True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-cp-poll                    True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-datadog                    True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-disable-updates            True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-externaldns                True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-nginx-ingress-controller   True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-rolessetup                 True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-tenancyoperator            True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-helmchart-vaultregistration          True    Release reconciliation succeeded   10d
cluster-addons           pr172922-aks-kubestatemetrics                     True    Release reconciliation succeeded   10d
mgmt                     microservice-1                                    True    Release reconciliation succeeded   20h
vault-injector-webhook   vault-injector-webhook                            True    Release reconciliation succeeded   10d

Logs: https://gist.github.com/paulcarlton-ww/2f22967692e2ab25a5c8f0a47435b058
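
A small helper can make the stuck state easier to spot when there are many releases. The following is only a minimal sketch, not part of helm-controller: it filters the tabular output of the `kubectl get` command used above and prints every HelmRelease whose READY column is not `True`. The function name `not_ready_releases` is hypothetical, and the sketch assumes the column layout shown in this report (READY is the third whitespace-separated column).

```shell
# Print "namespace/name" for every HelmRelease whose READY column is not "True".
# Reads the table printed by `kubectl get -A helmreleases...` on stdin.
# Assumption: READY is the third column, as in the listings in this report.
not_ready_releases() {
  # NR > 1 skips the header row; NF >= 3 skips blank or truncated lines.
  awk 'NR > 1 && NF >= 3 && $3 != "True" { print $1 "/" $2 }'
}

# Hypothetical usage against a live cluster:
#   kubectl get -A helmreleases.helm.toolkit.fluxcd.io | not_ready_releases
```

Running it before and after deleting the pod would show whether the list of not-ready releases shrinks once the controller restarts.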

nab-gha commented 3 years ago

Another occurrence, this time on my kind cluster after upgrading to v0.3.0. I have not deleted the pod yet.

kubectl get -A helmcharts.source.toolkit.fluxcd.io 
NAMESPACE     NAME                         CHART     VERSION   SOURCE KIND      SOURCE NAME   READY   STATUS                    AGE
gotk-system   apps-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
gotk-system   base-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
gotk-system   bootstrap-integration-test   podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
gotk-system   bootstrap-microservice-1     podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
gotk-system   bootstrap-microservice-2     podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
gotk-system   mgmt-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   15h
pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io 
NAMESPACE    NAME               READY   STATUS                                                       AGE
apps         microservice-1     False   HelmChart 'gotk-system/apps-microservice-1' is not ready     15h
base         microservice-1     False   HelmChart 'gotk-system/base-microservice-1' is not ready     15h
bootstrap    integration-test   True    Release reconciliation succeeded                             15h
bootstrap    microservice-1     True    Release reconciliation succeeded                             15h
bootstrap    microservice-2     True    Release reconciliation succeeded                             15h
kraan-test   microservice       False   chart reconciliation failed: namespaces "simple" not found   15m
kraan-test   microservice-two   False   chart reconciliation failed: namespaces "simple" not found   15m
mgmt         microservice-1     True    Release reconciliation succeeded                             15h
pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get helmreleases.helm.toolkit.fluxcd.io -n apps microservice-1  -o json
{
    "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",
    "kind": "HelmRelease",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"helm.toolkit.fluxcd.io/v2beta1\",\"kind\":\"HelmRelease\",\"metadata\":{\"annotations\":{},\"name\":\"microservice-1\",\"namespace\":\"apps\"},\"spec\":{\"chart\":{\"spec\":{\"chart\":\"podinfo\",\"sourceRef\":{\"kind\":\"HelmRepository\",\"name\":\"podinfo\",\"namespace\":\"gotk-system\"},\"version\":\"\\u003e4.0.0\"}},\"install\":{\"remediation\":{\"retries\":-1}},\"interval\":\"1m0s\",\"test\":{\"enable\":false,\"ignoreFailures\":false,\"timeout\":\"2m\"},\"upgrade\":{\"remediation\":{\"retries\":-1}},\"values\":{\"podinfo\":{\"message\":\"-Microservice Test 1\",\"replicaCount\":1,\"service\":{\"enabled\":true,\"type\":\"ClusterIP\"}},\"preHookActiveDeadlineSeconds\":60,\"preHookBackoffLimit\":1,\"preHookDelaySeconds\":10,\"preHookRestartPolicy\":\"Never\",\"preHookSucceed\":\"true\",\"testHookActiveDeadlineSeconds\":60,\"testHookBackoffLimit\":1,\"testHookDelaySeconds\":10,\"testHookRestartPolicy\":\"Never\",\"testHookSucceed\":\"true\"}}}\n"
        },
        "creationTimestamp": "2020-11-23T18:09:39Z",
        "finalizers": [
            "finalizers.fluxcd.io"
        ],
        "generation": 1,
        "labels": {
            "kraan/layer": "apps"
        },
        "managedFields": [
            {
                "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            ".": {},
                            "f:kubectl.kubernetes.io/last-applied-configuration": {}
                        },
                        "f:labels": {
                            ".": {},
                            "f:kraan/layer": {}
                        },
                        "f:ownerReferences": {
                            ".": {},
                            "k:{\"uid\":\"9a477c19-43bc-4451-8eaf-e2c7b58f06bf\"}": {
                                ".": {},
                                "f:apiVersion": {},
                                "f:blockOwnerDeletion": {},
                                "f:controller": {},
                                "f:kind": {},
                                "f:name": {},
                                "f:uid": {}
                            }
                        }
                    },
                    "f:spec": {
                        ".": {},
                        "f:chart": {
                            ".": {},
                            "f:spec": {
                                ".": {},
                                "f:chart": {},
                                "f:sourceRef": {
                                    ".": {},
                                    "f:kind": {},
                                    "f:name": {},
                                    "f:namespace": {}
                                },
                                "f:version": {}
                            }
                        },
                        "f:install": {
                            ".": {},
                            "f:remediation": {
                                ".": {},
                                "f:retries": {}
                            }
                        },
                        "f:interval": {},
                        "f:test": {
                            ".": {},
                            "f:timeout": {}
                        },
                        "f:upgrade": {
                            ".": {},
                            "f:remediation": {
                                ".": {},
                                "f:retries": {}
                            }
                        },
                        "f:values": {}
                    }
                },
                "manager": "kraan-controller",
                "operation": "Update",
                "time": "2020-11-23T18:09:39Z"
            },
            {
                "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:spec": {
                        "f:test": {
                            "f:enable": {},
                            "f:ignoreFailures": {}
                        }
                    }
                },
                "manager": "kubectl",
                "operation": "Update",
                "time": "2020-11-23T18:09:39Z"
            },
            {
                "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:finalizers": {
                            ".": {},
                            "v:\"finalizers.fluxcd.io\"": {}
                        }
                    },
                    "f:status": {
                        ".": {},
                        "f:conditions": {},
                        "f:failures": {},
                        "f:helmChart": {},
                        "f:lastAppliedRevision": {},
                        "f:lastAttemptedRevision": {},
                        "f:lastAttemptedValuesChecksum": {},
                        "f:lastReleaseRevision": {},
                        "f:observedGeneration": {}
                    }
                },
                "manager": "helm-controller",
                "operation": "Update",
                "time": "2020-11-24T08:24:20Z"
            }
        ],
        "name": "microservice-1",
        "namespace": "apps",
        "ownerReferences": [
            {
                "apiVersion": "kraan.io/v1alpha1",
                "blockOwnerDeletion": true,
                "controller": true,
                "kind": "AddonsLayer",
                "name": "apps",
                "uid": "9a477c19-43bc-4451-8eaf-e2c7b58f06bf"
            }
        ],
        "resourceVersion": "336839",
        "selfLink": "/apis/helm.toolkit.fluxcd.io/v2beta1/namespaces/apps/helmreleases/microservice-1",
        "uid": "a3f7bfdd-d1cd-45f8-b2a6-178680fc8dbb"
    },
    "spec": {
        "chart": {
            "spec": {
                "chart": "podinfo",
                "sourceRef": {
                    "kind": "HelmRepository",
                    "name": "podinfo",
                    "namespace": "gotk-system"
                },
                "version": "\u003e4.0.0"
            }
        },
        "install": {
            "remediation": {
                "retries": -1
            }
        },
        "interval": "1m0s",
        "test": {
            "timeout": "2m0s"
        },
        "upgrade": {
            "remediation": {
                "retries": -1
            }
        },
        "values": {
            "podinfo": {
                "message": "-Microservice Test 1",
                "replicaCount": 1,
                "service": {
                    "enabled": true,
                    "type": "ClusterIP"
                }
            },
            "preHookActiveDeadlineSeconds": 60,
            "preHookBackoffLimit": 1,
            "preHookDelaySeconds": 10,
            "preHookRestartPolicy": "Never",
            "preHookSucceed": "true",
            "testHookActiveDeadlineSeconds": 60,
            "testHookBackoffLimit": 1,
            "testHookDelaySeconds": 10,
            "testHookRestartPolicy": "Never",
            "testHookSucceed": "true"
        }
    },
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2020-11-24T08:24:20Z",
                "message": "HelmChart 'gotk-system/apps-microservice-1' is not ready",
                "reason": "ArtifactFailed",
                "status": "False",
                "type": "Ready"
            },
            {
                "lastTransitionTime": "2020-11-23T18:09:42Z",
                "message": "Helm install succeeded",
                "reason": "InstallSucceeded",
                "status": "True",
                "type": "Released"
            }
        ],
        "failures": 1,
        "helmChart": "gotk-system/apps-microservice-1",
        "lastAppliedRevision": "5.0.3",
        "lastAttemptedRevision": "5.0.3",
        "lastAttemptedValuesChecksum": "23736cfb27573949efbf5314de584043d81b5745",
        "lastReleaseRevision": 1,
        "observedGeneration": 1
    }
}
pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io 
NAMESPACE    NAME               READY   STATUS                                                       AGE
apps         microservice-1     False   HelmChart 'gotk-system/apps-microservice-1' is not ready     15h
base         microservice-1     False   HelmChart 'gotk-system/base-microservice-1' is not ready     15h
bootstrap    integration-test   True    Release reconciliation succeeded                             15h
bootstrap    microservice-1     True    Release reconciliation succeeded                             15h
bootstrap    microservice-2     True    Release reconciliation succeeded                             15h
kraan-test   microservice       False   chart reconciliation failed: namespaces "simple" not found   16m
kraan-test   microservice-two   False   chart reconciliation failed: namespaces "simple" not found   16m
mgmt         microservice-1     True    Release reconciliation succeeded                             15h
pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get al
NAME        VERSION        SOURCE          PATH                            STATUS     REASON
apps        0.0.07         addons-config   ./testdata/addons/apps          Deployed   AddonsLayer version 0.0.07 is Deployed
base        0.0.07         addons-config   ./testdata/addons/base          Deployed   AddonsLayer version 0.0.07 is Deployed
bootstrap   0.0.07         addons-config   ./testdata/addons/bootstrap     Deployed   AddonsLayer version 0.0.07 is Deployed
mgmt        0.0.07         addons-config   ./testdata/addons/mgmt          Deployed   AddonsLayer version 0.0.07 is Deployed
test        test-version   test            ./testdata/crds/test_crd.yaml   Failed     AddonsLayer processsing has failed

pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io 
NAMESPACE    NAME               READY   STATUS                                                       AGE
apps         microservice-1     False   HelmChart 'gotk-system/apps-microservice-1' is not ready     15h
base         microservice-1     False   HelmChart 'gotk-system/base-microservice-1' is not ready     15h
bootstrap    integration-test   True    Release reconciliation succeeded                             15h
bootstrap    microservice-1     True    Release reconciliation succeeded                             15h
bootstrap    microservice-2     True    Release reconciliation succeeded                             15h
kraan-test   microservice       False   chart reconciliation failed: namespaces "simple" not found   17m
kraan-test   microservice-two   False   chart reconciliation failed: namespaces "simple" not found   17m
mgmt         microservice-1     True    Release reconciliation succeeded   

Logs: https://gist.github.com/paulcarlton-ww/27f9ee74b6831971e23b4f871af28895
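
The listings in this comment show the mismatch at the heart of the report: every HelmChart is ready, yet some HelmReleases still claim their chart is not ready. A rough sketch for spotting that mismatch automatically is shown here, assuming both listings have been saved to files. The function name `stale_releases` and the file names are hypothetical, and the parsing relies on the column layout shown above (READY is the seventh whitespace-separated field in HelmChart data rows and the third in HelmRelease rows).

```shell
# stale_releases CHARTS_FILE RELEASES_FILE
# Prints each HelmRelease that reports "HelmChart '<ns>/<name>' is not ready"
# although the HelmChart listing shows that chart with READY=True.
stale_releases() {
  awk -v q="'" '
    # First file: remember the READY value of each chart, keyed by namespace/name.
    NR == FNR { if (FNR > 1) ready[$1 "/" $2] = $7; next }
    # Second file: look only at releases blaming a HelmChart.
    FNR > 1 && $3 == "False" && $4 == "HelmChart" {
      chart = $5
      gsub(q, "", chart)   # strip the single quotes around the chart reference
      if (ready[chart] == "True") print $1 "/" $2 " -> " chart
    }
  ' "$1" "$2"
}

# Hypothetical usage:
#   kubectl get -A helmcharts.source.toolkit.fluxcd.io > charts.txt
#   kubectl get -A helmreleases.helm.toolkit.fluxcd.io > releases.txt
#   stale_releases charts.txt releases.txt
```

Any line printed points at a release whose status looks stale relative to its chart, which matches the symptom described here. Releases that fail for other reasons, such as the `kraan-test` ones, are deliberately ignored.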

nab-gha commented 3 years ago

Source controller logs: https://gist.github.com/paulcarlton-ww/d7b7e495fcf901ce5e9aa03d17f91205

nab-gha commented 3 years ago

I deleted the helm-controller pod, and the HelmReleases recovered:

kubectl get -A helmcharts.source.toolkit.fluxcd.io 
NAMESPACE     NAME                         CHART     VERSION   SOURCE KIND      SOURCE NAME   READY   STATUS                    AGE
gotk-system   apps-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
gotk-system   base-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
gotk-system   bootstrap-integration-test   podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
gotk-system   bootstrap-microservice-1     podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
gotk-system   bootstrap-microservice-2     podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
gotk-system   mgmt-microservice-1          podinfo   >4.0.0    HelmRepository   podinfo       True    Fetched revision: 5.0.3   16h
pcarlton@pcarlton3:~/go/src/github.com/fidelity/kraan$ kubectl get -A helmreleases.helm.toolkit.fluxcd.io 
NAMESPACE   NAME               READY   STATUS                             AGE
apps        microservice-1     True    Release reconciliation succeeded   16h
base        microservice-1     True    Release reconciliation succeeded   16h
bootstrap   integration-test   True    Release reconciliation succeeded   16h
bootstrap   microservice-1     True    Release reconciliation succeeded   16h
bootstrap   microservice-2     True    Release reconciliation succeeded   16h
mgmt        microservice-1     True    Release reconciliation succeeded   16h

Source controller logs: https://gist.github.com/paulcarlton-ww/3abd7ec6316e3db158394daa5d851cb8
Helm controller logs: https://gist.github.com/paulcarlton-ww/a107143c60d08be15cad212a89d958c8
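
The workaround used in this thread is deleting the helm-controller pod by name. A possible variation, sketched below, is to roll the Deployment instead, so the generated pod name does not have to be looked up first. This swaps `kubectl delete pod` for `kubectl rollout restart`, and it is a workaround, not a fix for the underlying stall. It assumes the Deployment is named `helm-controller` (inferred from the pod name `helm-controller-6b8747c4dc-shkdr`) and runs in the `gotk-system` namespace; the function name and the `KUBECTL` override are hypothetical.

```shell
# Restart the helm-controller by rolling its Deployment.
# Assumptions: Deployment name "helm-controller" (inferred from the pod name in
# this report) and namespace "gotk-system" unless another one is passed as $1.
restart_helm_controller() {
  ns="${1:-gotk-system}"
  # KUBECTL may be set to "echo kubectl" to print the command instead of running it.
  ${KUBECTL:-kubectl} -n "$ns" rollout restart deployment/helm-controller
}

# Hypothetical usage:
#   restart_helm_controller                 # uses gotk-system
#   KUBECTL="echo kubectl"; restart_helm_controller   # dry run, prints the command
```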

PrivatePuffin commented 3 years ago

Still seeing this on the latest release, @stefanprodan.

stefanprodan commented 3 years ago

@Ornias1993 I doubt it's the same issue. Please open a new one with logs and explain how to reproduce it.

PrivatePuffin commented 3 years ago

@stefanprodan I already did. At flux, that is, as I have no idea what the actual culprit of my issue is.

(And I honestly also have no time to figure out how the underlying flux organisation is managed when it comes to separate repos and operators.)