fluxcd / source-controller

The GitOps Toolkit source management component
https://fluxcd.io
Apache License 2.0

[helm-oci] ECR auth expires #787

Closed nalbury closed 2 years ago

nalbury commented 2 years ago

While attempting to set up ECR as an OCI chart repo, we followed the recommended pattern here to configure a Kube secret with the required registry credentials for the OCI repo, but noticed that the source controller only seems to fetch this secret once on boot. This unfortunately means that once the ECR token expires, the source controller needs to be restarted before authentication will work again and the repo/charts can be reconciled.

Example of the state post expiration:

I know the recommended pattern linked above comes from the documentation for the image automation controllers, so I'm wondering whether the source-controller is supposed to operate the same way. It was mentioned here that some caching may be at play.

souleb commented 2 years ago

Hello @nalbury, can you post kubectl describe helmrepository and kubectl describe helmrelease here please?

Also can you post the source-controller logs as well please?

nalbury commented 2 years ago

As requested (had to redact some account IDs and some of the values as they're work specific):

HelmRepository:

Name:         ecr
Namespace:    flux-system
Labels:       kustomize.toolkit.fluxcd.io/name=sources
              kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations:  reconcile.fluxcd.io/requestedAt: 2022-06-20T12:51:19.602886928Z
API Version:  source.toolkit.fluxcd.io/v1beta2
Kind:         HelmRepository
Metadata:
  Creation Timestamp:  2022-06-09T14:57:39Z
  Finalizers:
    finalizers.fluxcd.io
  Generation:  5
  Managed Fields:
    API Version:  source.toolkit.fluxcd.io/v1beta2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:kustomize.toolkit.fluxcd.io/name:
          f:kustomize.toolkit.fluxcd.io/namespace:
      f:spec:
        f:interval:
        f:secretRef:
          f:name:
        f:type:
        f:url:
    Manager:      kustomize-controller
    Operation:    Apply
    Time:         2022-06-09T17:55:56Z
    API Version:  source.toolkit.fluxcd.io/v1beta2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"finalizers.fluxcd.io":
      f:status:
        f:conditions:
        f:lastHandledReconcileAt:
        f:observedGeneration:
    Manager:      source-controller
    Operation:    Update
    Time:         2022-06-10T11:32:36Z
    API Version:  source.toolkit.fluxcd.io/v1beta2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:reconcile.fluxcd.io/requestedAt:
    Manager:         flux
    Operation:       Update
    Time:            2022-06-10T11:34:24Z
  Resource Version:  13055348
  UID:               9e31f644-3840-4a7e-a34e-fbbc96e403c0
Spec:
  Interval:  5m0s
  Secret Ref:
    Name:   ecr-auth
  Timeout:  60s
  Type:     oci
  URL:      oci://<redacted>.dkr.ecr.us-west-2.amazonaws.com/helm
Status:
  Conditions:
    Last Transition Time:     2022-06-16T13:52:02Z
    Message:                  Helm repository is ready
    Observed Generation:      5
    Reason:                   Succeeded
    Status:                   True
    Type:                     Ready
  Last Handled Reconcile At:  2022-06-20T12:51:19.602886928Z
  Observed Generation:        5
Events:                       <none>

HelmRelease:

Name:         my-app
Namespace:    my-namespace
Labels:       kustomize.toolkit.fluxcd.io/name=my-namespace
              kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations:  reconcile.fluxcd.io/requestedAt: 2022-06-20T12:51:56.444579522Z
API Version:  helm.toolkit.fluxcd.io/v2beta1
Kind:         HelmRelease
Metadata:
  Creation Timestamp:  2022-06-09T15:05:42Z
  Finalizers:
    finalizers.fluxcd.io
  Generation:  5
  Managed Fields:
    API Version:  helm.toolkit.fluxcd.io/v2beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:kustomize.toolkit.fluxcd.io/name:
          f:kustomize.toolkit.fluxcd.io/namespace:
      f:spec:
        f:chart:
          f:spec:
            f:chart:
            f:sourceRef:
              f:kind:
              f:name:
              f:namespace:
            f:version:
        f:install:
          f:remediation:
            f:retries:
        f:interval:
        f:releaseName:
        f:targetNamespace:
        f:timeout:
        f:values:
    Manager:      kustomize-controller
    Operation:    Apply
    Time:         2022-06-10T13:00:17Z
    API Version:  helm.toolkit.fluxcd.io/v2beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:reconcile.fluxcd.io/requestedAt:
    Manager:      flux
    Operation:    Update
    Time:         2022-06-20T12:52:19Z
    API Version:  helm.toolkit.fluxcd.io/v2beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"finalizers.fluxcd.io":
      f:status:
        f:conditions:
        f:failures:
        f:helmChart:
        f:lastAppliedRevision:
        f:lastAttemptedRevision:
        f:lastAttemptedValuesChecksum:
        f:lastHandledReconcileAt:
        f:lastReleaseRevision:
        f:observedGeneration:
    Manager:         helm-controller
    Operation:       Update
    Time:            2022-06-21T00:01:06Z
  Resource Version:  19441662
  UID:               313cd68e-e8bd-4bab-b98b-352bce3d7a64
Spec:
  Chart:
    Spec:
      Chart:               my-chart
      Reconcile Strategy:  ChartVersion
      Source Ref:
        Kind:       HelmRepository
        Name:       ecr
        Namespace:  flux-system
      Version:      0.2.0
  Install:
    Remediation:
      Retries:       3
  Interval:          1m
  Release Name:      my-app
  Target Namespace:  my-namespace
  Timeout:           10m0s
  Values:
    Image:
      Tag:  my-app-1.0.19
Status:
  Conditions:
    Last Transition Time:          2022-06-21T00:01:06Z
    Message:                       HelmChart 'flux-system/my-namespace-my-app' is not ready
    Reason:                        ArtifactFailed
    Status:                        False
    Type:                          Ready
    Last Transition Time:          2022-06-20T12:52:27Z
    Message:                       Helm upgrade succeeded
    Reason:                        UpgradeSucceeded
    Status:                        True
    Type:                          Released
  Failures:                        9337
  Helm Chart:                      flux-system/my-namespace-my-app
  Last Applied Revision:           0.2.0
  Last Attempted Revision:         0.2.0
  Last Attempted Values Checksum:  6dd0181482c19c5a1d858af61822bc1e954ac809
  Last Handled Reconcile At:       2022-06-20T12:51:56.444579522Z
  Last Release Revision:           4
  Observed Generation:             5
Events:
  Type    Reason  Age                      From             Message
  ----    ------  ----                     ----             -------
  Normal  info    3m30s (x21603 over 17d)  helm-controller  HelmChart 'flux-system/my-namespace-my-app' is not ready

Source Controller logs:

{"level":"info","ts":"2022-06-27T11:55:44.970Z","logger":"controller.gitrepository","msg":"no changes since last reconcilation: observed revision 'master/c2be202685fd4c5218d6da49d9eff23480ce7d2f'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:11.803Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: '1.4.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kube-system-aws-load-balancer-controller","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:12.217Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v3.22.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"calico-system-calico","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:13.836Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v2.3.1'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kyverno-kyverno","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:56:45.205Z","logger":"controller.gitrepository","msg":"no changes since last reconcilation: observed revision 'master/c2be202685fd4c5218d6da49d9eff23480ce7d2f'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"error","ts":"2022-06-27T11:56:46.008Z","logger":"controller.helmchart","msg":"Reconciler error","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"my-namespace-my-app","namespace":"flux-system","error":"chart pull error: chart pull error: failed to get chart version for remote reference: GET \"https://<redacted>.dkr.ecr.us-west-2.amazonaws.com/v2/helm/my-chart/tags/list\": unexpected status code 403: denied: Your authorization token has expired. Reauthenticate and try again."}
{"level":"info","ts":"2022-06-27T11:57:11.826Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: '1.4.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kube-system-aws-load-balancer-controller","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:57:12.222Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v3.22.0'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"calico-system-calico","namespace":"flux-system"}
{"level":"info","ts":"2022-06-27T11:57:13.854Z","logger":"controller.helmchart","msg":"artifact up-to-date with remote revision: 'v2.3.1'","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"HelmChart","name":"kyverno-kyverno","namespace":"flux-system"}

souleb commented 2 years ago

Thanks @nalbury we have identified the issue. Working on fixing this.

nalbury commented 2 years ago

Amazing thank you!

stefanprodan commented 2 years ago

We'll probably have to use @souleb's fork of Helm until this gets merged: https://github.com/helm/helm/pull/11086

souleb commented 2 years ago

I have tested the fix with the following scenarios

@nalbury do you have the possibility to test the fix? See #799

nalbury commented 2 years ago

Yup deployed an image built from your branch this morning. Should be able to verify this evening once the currently loaded token expires.