fluxcd / source-controller

The GitOps Toolkit source management component
https://fluxcd.io
Apache License 2.0
240 stars 187 forks source link

Chart pull error "oci://public.ecr.aws" #845

Closed michaelsatish closed 2 years ago

michaelsatish commented 2 years ago

Hi,

I am trying to have flux manage ack eks controller helm chart and ran into a "chart pull" error. It could be my helm release configuration. Any help will be much appreciated.

Helm Repository

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: ack-eks
  namespace: ack-system
spec:
  interval: 1m0s
  url: oci://public.ecr.aws/aws-controllers-k8s
  type: "oci"
  secretRef:
    name: ecr-credentials
❯ k get helmrepository -n ack-system
NAME      URL                                        AGE   READY   STATUS
ack-eks   oci://public.ecr.aws/aws-controllers-k8s   77m   True    Helm repository is ready

Helm Release

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ack-eks
  namespace: ack-system
spec:
  chart:
    spec:
      chart: eks-chart
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: ack-eks
      version: v0.1.3
  interval: 1m0s
  values:
    serviceAccount: 
      name: ack-eks-sa
    aws:
      region: us-east-1
❯ k get helmchart -n ack-system
NAME                 CHART       VERSION   SOURCE KIND      SOURCE NAME   AGE   READY   STATUS
ack-system-ack-eks   eks-chart   v0.1.3    HelmRepository   ack-eks       67m   False   chart pull error: chart pull error: failed to get chart version for remote reference: unable to locate any tags in provided repository: oci://public.ecr.aws/aws-controllers-k8s/eks-chart

Deploying it manually works,

helm install ack-eks-controller oci://public.ecr.aws/aws-controllers-k8s/eks-chart --version=v0.1.3 --set=aws.region=us-east-1 -n ack-system
stefanprodan commented 2 years ago

Public ECR doesn't not allow listing tags, I think Flux tries to list the tags before pulling and fails due to https://github.com/aws/containers-roadmap/issues/1262

makkes commented 2 years ago

Details on the HC confirming above error message:

$ k -n flux-system get hc flux-system-ack-eks
NAME                  CHART       VERSION   SOURCE KIND      SOURCE NAME   AGE   READY   STATUS
flux-system-ack-eks   eks-chart   *         HelmRepository   ack           45s   False   chart pull error: chart pull error: failed to get chart version for remote reference: unable to locate any tags in provided repository: oci://public.ecr.aws/aws-controllers-k8s/eks-chart

helm-controller needs to fetch all tags to determine the exact chart version to install given that spec.chart.spec.version can be a semver range. So as long as AWS's public registry doesn't support fetching tags you can't use it with Flux.

makkes commented 2 years ago

We will at least defuse the situation by not listing tags if a specific version has been provided in the HelmRelease.

jlbutler commented 2 years ago

Public ECR doesn't not allow listing tags, I think Flux tries to list the tags before pulling and fails due to aws/containers-roadmap#1262

The fix for this is rolling out, before workarounds are pursued would be good to see if this is still an issue.

jlbutler commented 2 years ago

Public ECR doesn't not allow listing tags, I think Flux tries to list the tags before pulling and fails due to aws/containers-roadmap#1262

The fix for this is rolling out, before workarounds are pursued would be good to see if this is still an issue.

Apparently it wasn't yet ready, apologies. It is still set for coming soon, sorry for the noise.

michaelsatish commented 2 years ago

@makkes Thank you for the fix. I updated to the new version of the source controller v0.25.11, and it resolved my issue.

❯ k get helmchart -n ack-system
NAME                 CHART       VERSION   SOURCE KIND      SOURCE NAME   AGE     READY   STATUS
ack-system-ack-eks   eks-chart   v0.1.3    HelmRepository   ack-eks       7m46s   True    pulled 'eks-chart' chart with version 'v0.1.3'

❯ k get helmrelease -n ack-system
NAME      AGE     READY   STATUS
ack-eks   7m56s   True    Release reconciliation succeeded
makkes commented 2 years ago

I'm happy to hear that, @michaelsatish and I'm going to close this issue now as there's nothing else in Flux we can do apart from waiting for AWS to support the tag listing API.

makkes commented 2 years ago

Thanks for keeping us up-to-date on the progress @jlbutler! Much appreciated.

jlbutler commented 2 years ago

@makkes @stefanprodan and others... dropped a comment on our issue tracking tags missing from ECR Public's v2 API. We're very interested in getting this addressed so that projects don't need to work around it.

TL;DR question for you - if we iterate toward full OCI compliance, are you ok for the Flux source-controller use case with us deferring the lexical sorting and transparent pagination token aspect of the spec, to focus on a first delivery that is the correct response body?

If you have the time, can you reply here or on the roadmap ticket.

Thanks for your time!

makkes commented 2 years ago

TL;DR question for you - if we iterate toward full OCI compliance, are you ok for the Flux source-controller use case with us deferring the lexical sorting and transparent pagination token aspect of the spec, to focus on a first delivery that is the correct response body?

If you have the time, can you reply here or on the roadmap ticket.

Hey @jlbutler, thanks for coming back here. Order doesn't matter for us as source-controller sorts the tags by itself. ~As for pagination I'm not sure I understand how you would implement that. Would there just be no pagination at all?~ Ok, I scrolled through 1262 and understand the the ECR API would return a nextToken in the respose body. If that's correct than this would not work with source-controller as we're using oras-go for registry interaction and the only pagination that lib supports is via the specified Link HTTP header.

jlbutler commented 2 years ago

TL;DR question for you - if we iterate toward full OCI compliance, are you ok for the Flux source-controller use case with us deferring the lexical sorting and transparent pagination token aspect of the spec, to focus on a first delivery that is the correct response body? If you have the time, can you reply here or on the roadmap ticket.

Hey @jlbutler, thanks for coming back here. Order doesn't matter for us as source-controller sorts the tags by itself. ~As for pagination I'm not sure I understand how you would implement that. Would there just be no pagination at all?~ Ok, I scrolled through 1262 and understand the the ECR API would return a nextToken in the respose body. If that's correct than this would not work with source-controller as we're using oras-go for registry interaction and the only pagination that lib supports is via the specified Link HTTP header.

Hi @makkes thanks for taking the time to respond!

Our first release would not include a nextToken in the body - the body will be fully compliant with the spec. This is to say, the response body will be formatted as below without additional elements:

{
  "name": "<name>",
  "tags": [
    "<tag1>",
    "<tag2>",
    "<tag3>",
    ..
  ]
}

If it works for most use cases, we will use the Link header for pagination, but the value will be an opaque token embedded in a full URL. So the pagination will not be fully spec-compliant, but should still work with clients that adhere to RFC5988. And it'll be moot for clients that don't need pagination. It won't interfere with the expected response body per the spec. This is how ECR works for private repos today.

Glad the sorting isn't a sticking point. Does the rest sound reasonable?

makkes commented 2 years ago

As long as the Link header has a format like this

link: </v2/fluxcd/helm-controller/tags/list?last=v0.0.8&n=2>SOMETHING ELSE

it will work from what I can tell by looking at the oras-go code. Oras merely looks at what's inside the two <...> markers.

stefanprodan commented 2 years ago

We use oras only for Helm, while Flux image automation and source controller’ OCIRepository are using https://github.com/google/go-containerregistry/blob/main/pkg/v1/remote/list.go

jlbutler commented 2 years ago

Ok so I think the proposal will unstick our use case here, and you don't have to back-bend around it. Thank you so much for the feedback, super appreciate it!