nalbury closed this issue 2 years ago
Hello @nalbury, can you post the output of kubectl describe helmrepository and kubectl describe helmrelease here, please?
Can you also post the source-controller logs?
As requested (I had to redact some account IDs and some of the values, as they're work-specific):
HelmRepository:
Name: ecr
Namespace: flux-system
Labels: kustomize.toolkit.fluxcd.io/name=sources
kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations: reconcile.fluxcd.io/requestedAt: 2022-06-20T12:51:19.602886928Z
API Version: source.toolkit.fluxcd.io/v1beta2
Kind: HelmRepository
Metadata:
Creation Timestamp: 2022-06-09T14:57:39Z
Finalizers:
finalizers.fluxcd.io
Generation: 5
Managed Fields:
API Version: source.toolkit.fluxcd.io/v1beta2
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:kustomize.toolkit.fluxcd.io/name:
f:kustomize.toolkit.fluxcd.io/namespace:
f:spec:
f:interval:
f:secretRef:
f:name:
f:type:
f:url:
Manager: kustomize-controller
Operation: Apply
Time: 2022-06-09T17:55:56Z
API Version: source.toolkit.fluxcd.io/v1beta2
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizers.fluxcd.io":
f:status:
f:conditions:
f:lastHandledReconcileAt:
f:observedGeneration:
Manager: source-controller
Operation: Update
Time: 2022-06-10T11:32:36Z
API Version: source.toolkit.fluxcd.io/v1beta2
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:reconcile.fluxcd.io/requestedAt:
Manager: flux
Operation: Update
Time: 2022-06-10T11:34:24Z
Resource Version: 13055348
UID: 9e31f644-3840-4a7e-a34e-fbbc96e403c0
Spec:
Interval: 5m0s
Secret Ref:
Name: ecr-auth
Timeout: 60s
Type: oci
URL: oci://<redacted>.dkr.ecr.us-west-2.amazonaws.com/helm
Status:
Conditions:
Last Transition Time: 2022-06-16T13:52:02Z
Message: Helm repository is ready
Observed Generation: 5
Reason: Succeeded
Status: True
Type: Ready
Last Handled Reconcile At: 2022-06-20T12:51:19.602886928Z
Observed Generation: 5
Events: <none>
HelmRelease:
Name: my-app
Namespace: my-namespace
Labels: kustomize.toolkit.fluxcd.io/name=my-namespace
kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations: reconcile.fluxcd.io/requestedAt: 2022-06-20T12:51:56.444579522Z
API Version: helm.toolkit.fluxcd.io/v2beta1
Kind: HelmRelease
Metadata:
Creation Timestamp: 2022-06-09T15:05:42Z
Finalizers:
finalizers.fluxcd.io
Generation: 5
Managed Fields:
API Version: helm.toolkit.fluxcd.io/v2beta1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:kustomize.toolkit.fluxcd.io/name:
f:kustomize.toolkit.fluxcd.io/namespace:
f:spec:
f:chart:
f:spec:
f:chart:
f:sourceRef:
f:kind:
f:name:
f:namespace:
f:version:
f:install:
f:remediation:
f:retries:
f:interval:
f:releaseName:
f:targetNamespace:
f:timeout:
f:values:
Manager: kustomize-controller
Operation: Apply
Time: 2022-06-10T13:00:17Z
API Version: helm.toolkit.fluxcd.io/v2beta1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:reconcile.fluxcd.io/requestedAt:
Manager: flux
Operation: Update
Time: 2022-06-20T12:52:19Z
API Version: helm.toolkit.fluxcd.io/v2beta1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizers.fluxcd.io":
f:status:
f:conditions:
f:failures:
f:helmChart:
f:lastAppliedRevision:
f:lastAttemptedRevision:
f:lastAttemptedValuesChecksum:
f:lastHandledReconcileAt:
f:lastReleaseRevision:
f:observedGeneration:
Manager: helm-controller
Operation: Update
Time: 2022-06-21T00:01:06Z
Resource Version: 19441662
UID: 313cd68e-e8bd-4bab-b98b-352bce3d7a64
Spec:
Chart:
Spec:
Chart: my-chart
Reconcile Strategy: ChartVersion
Source Ref:
Kind: HelmRepository
Name: ecr
Namespace: flux-system
Version: 0.2.0
Install:
Remediation:
Retries: 3
Interval: 1m
Release Name: my-app
Target Namespace: my-namespace
Timeout: 10m0s
Values:
Image:
Tag: my-app-1.0.19
Status:
Conditions:
Last Transition Time: 2022-06-21T00:01:06Z
Message: HelmChart 'flux-system/my-namespace-my-app' is not ready
Reason: ArtifactFailed
Status: False
Type: Ready
Last Transition Time: 2022-06-20T12:52:27Z
Message: Helm upgrade succeeded
Reason: UpgradeSucceeded
Status: True
Type: Released
Failures: 9337
Helm Chart: flux-system/my-namespace-my-app
Last Applied Revision: 0.2.0
Last Attempted Revision: 0.2.0
Last Attempted Values Checksum: 6dd0181482c19c5a1d858af61822bc1e954ac809
Last Handled Reconcile At: 2022-06-20T12:51:56.444579522Z
Last Release Revision: 4
Observed Generation: 5
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal info 3m30s (x21603 over 17d) helm-controller HelmChart 'flux-system/my-namespace-my-app' is not ready
Source Controller logs:
{"level":"info","ts":"2022-06-27T11:55:44.970Z","logger":"controller.gitrepository","msg":"no changes since last reconcilation: observed revision 'master/c2be202685fd4c5218d6da49d9eff23480ce7d2f'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:11.803Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: '1.4.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kube-system-aws-load-balancer-controller","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:12.217Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v3.22.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"calico-system-calico","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:13.836Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v2.3.1'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kyverno-kyverno","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:45.205Z","logger":"controller.gitrepository","msg":"no changes since last reconcilation: observed revision 'master/c2be202685fd4c5218d6da49d9eff23480ce7d2f'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"error","ts":"2022-06-27T11:56:46.008Z","logger":"controller.helmchart","msg":"Reconciler error","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"my-namespace-my-app","namespace":"flux-system","error":"chart pull error: chart pull error: failed to get chart version for remote reference: GET \"https://<redacted>.dkr.ecr.us-west-2.amazonaws.com/v2/helm/my-chart/tags/list\": unexpected status code 403: denied: Your authorization token has expired. Reauthenticate and try again."}
{"level":"info","ts":"2022-06-27T11:57:11.826Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: '1.4.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kube-system-aws-load-balancer-controller","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:57:12.222Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v3.22.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"calico-system-calico","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:57:13.854Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v2.3.1'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kyverno-kyverno","namespace":"flux-system"}
Thanks @nalbury, we have identified the issue and are working on a fix.
Amazing, thank you!
We'll probably have to use @souleb's fork of Helm until this gets merged: https://github.com/helm/helm/pull/11086
I have tested the fix with the following scenarios
@nalbury are you able to test the fix? See #799
Yup, I deployed an image built from your branch this morning. I should be able to verify this evening once the currently loaded token expires.
While attempting to set up ECR as an OCI chart repo, we followed the recommended pattern here to configure a Kubernetes secret with the required registry credentials for the OCI repo. However, we noticed that the source-controller only seems to fetch this secret once, on boot. Unfortunately, this means that once the ECR token expires, the source-controller needs to be restarted before authentication works again and the repo/charts can be reconciled.
Example of the state post expiration:
I can log in to ECR via the Helm CLI with the data in the Kubernetes secret.
But if I look at the status of an ECR-hosted Helm chart, there's a chart pull error saying the token has expired.
If I restart the source-controller (delete the pod), the secret is seemingly reloaded on boot and the chart can reconcile again until the newly loaded token expires.
I know the recommended pattern linked above comes from the documentation for the image automation controllers, so I'm wondering whether the source-controller is supposed to operate in the same way. It was mentioned here that some caching may be at play.
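To make the suspected behaviour concrete, here is a toy model of the symptom, not the source-controller's actual code. It assumes a hypothetical job rotates the secret every 6 hours, and relies on ECR authorization tokens being valid for 12 hours. Every class and function name is invented for illustration. One client copies the token once at boot, the other re-reads the secret on every reconcile.

```python
from dataclasses import dataclass

TOKEN_TTL_HOURS = 12     # ECR authorization tokens are valid for 12 hours
ROTATE_EVERY_HOURS = 6   # assumption: interval of the hypothetical refresh job


@dataclass(frozen=True)
class Token:
    issued_at: int

    @property
    def expires_at(self) -> int:
        return self.issued_at + TOKEN_TTL_HOURS


class Secret:
    """Stands in for the Kubernetes secret that the refresh job keeps updating."""

    def __init__(self) -> None:
        self.token = Token(issued_at=0)

    def rotate(self, now: int) -> None:
        self.token = Token(issued_at=now)


def pull_chart(token: Token, now: int) -> str:
    """Stands in for the registry: accepts only unexpired tokens."""
    return "ok" if now < token.expires_at else "403 token expired"


class CachedAtBoot:
    """Copies the token once at start-up and never looks at the secret again."""

    def __init__(self, secret: Secret) -> None:
        self._token = secret.token

    def reconcile(self, secret: Secret, now: int) -> str:
        return pull_chart(self._token, now)


class ReadEachReconcile:
    """Re-reads the secret on every reconcile."""

    def reconcile(self, secret: Secret, now: int) -> str:
        return pull_chart(secret.token, now)


def simulate(hours: int = 24):
    """Return (hour, cached result, fresh result) for each simulated hour."""
    secret = Secret()
    cached, fresh = CachedAtBoot(secret), ReadEachReconcile()
    results = []
    for now in range(1, hours + 1):
        if now % ROTATE_EVERY_HOURS == 0:
            secret.rotate(now)  # the refresh job writes a new token
        results.append(
            (now, cached.reconcile(secret, now), fresh.reconcile(secret, now))
        )
    return results


if __name__ == "__main__":
    for hour, cached_result, fresh_result in simulate():
        print(hour, cached_result, fresh_result)
```

In this model the boot-time copy starts failing at hour 12 even though the secret has been rotated twice by then, while the client that re-reads the secret never fails. That matches what we see: the secret data works from the Helm CLI, but the controller keeps getting a 403 until it is restarted.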