fluxcd / flux

Successor: https://github.com/fluxcd/flux2
https://fluxcd.io
Apache License 2.0

Error fetching image metadata from AWS ECR (China region) #2954

Closed terrificdm closed 4 years ago

terrificdm commented 4 years ago

My EKS cluster (version 1.15) runs in the AWS China region cn-north-1, and I have successfully installed Flux with Helm. Flux works well: it can access Git with the deploy key and create or update applications from the configuration stored in the Git repo. The issue is that Flux cannot fetch the app's image metadata from ECR. As a result, when a newer image is pushed to the ECR repo, Flux does not update the configuration in the Git repo and therefore does not update the applications.

Logs as below:

ts=2020-03-31T18:23:19.267382623Z caller=images.go:159 component=sync-loop err="fetching image metadata for 634072196806.dkr.ecr.cn-north-1.amazonaws.com.cn/gitops-app: item not in cache, last error: Get https://634072196806.dkr.ecr.cn-north-1.amazonaws.com.cn/v2/gitops-app/tags/list: no basic auth credentials"
ts=2020-03-31T18:23:23.195082293Z caller=warming.go:180 component=warmer canonical_name=634072196806.dkr.ecr.cn-north-1.amazonaws.com.cn/gitops-app auth={map[]} err="requesting tags: Get https://634072196806.dkr.ecr.cn-north-1.amazonaws.com.cn/v2/gitops-app/tags/list: no basic auth credentials"

I have checked my EKS nodes' IAM role, and the ECR read-only policy is attached properly. I even attached the ECR full-access policy, but that didn't help. I also tried IRSA, binding an IAM role with the ECR full-access policy to the Flux service account, but it still failed.

The weird thing is that I installed EKS and Flux in us-east-1 with the same configuration as in the China region, and there Flux works very well with EKS and ECR; no issues came up.

Can anyone help me solve this issue? And can I check or configure Flux's authentication for accessing AWS ECR directly?

Thanks a lot.

terrificdm commented 4 years ago

I just checked the source code of "aws.go" (https://github.com/fluxcd/flux/blob/master/pkg/registry/aws.go) in the Flux repo and noticed that the constant ecrHostSuffix is set to ".amazonaws.com". For the AWS China regions, however, ECR registry URLs should look like "account-id.dkr.ecr.region.amazonaws.com.cn", according to the AWS documentation (https://docs.amazonaws.cn/en_us/aws/latest/userguide/endpoints-Beijing.html and https://docs.amazonaws.cn/en_us/aws/latest/userguide/ecr.html) and the Kubernetes AWS credential provider (https://github.com/kubernetes/kubernetes/blob/master/pkg/credentialprovider/aws/aws_credentials.go). So I guess this may be the main cause of the issue, but I am not 100% sure that I am right.
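
To illustrate the suspicion, here is a minimal Go sketch of host parsing that accepts both suffixes. It is not the code in Flux's aws.go: the names ecrHostSuffixes and parseECRHost are hypothetical, and the only assumption taken from the text above is the "account-id.dkr.ecr.region.<suffix>" host layout.

```go
package main

import (
	"fmt"
	"strings"
)

// ecrHostSuffixes is a hypothetical list for this sketch (not taken from Flux).
// It covers the standard AWS partition and the China partition.
var ecrHostSuffixes = []string{
	".amazonaws.com.cn", // China regions such as cn-north-1
	".amazonaws.com",    // standard regions such as us-east-1
}

// parseECRHost splits a registry host of the assumed form
// <account-id>.dkr.ecr.<region><suffix> into account ID and region.
// ok is false when the host does not look like an ECR registry.
func parseECRHost(host string) (accountID, region string, ok bool) {
	for _, suffix := range ecrHostSuffixes {
		if !strings.HasSuffix(host, suffix) {
			continue
		}
		parts := strings.Split(strings.TrimSuffix(host, suffix), ".")
		if len(parts) != 4 || parts[0] == "" || parts[1] != "dkr" ||
			parts[2] != "ecr" || parts[3] == "" {
			return "", "", false
		}
		return parts[0], parts[3], true
	}
	return "", "", false
}

func main() {
	// Registry host from the logs in this issue.
	host := "634072196806.dkr.ecr.cn-north-1.amazonaws.com.cn"
	accountID, region, ok := parseECRHost(host)
	fmt.Println(accountID, region, ok)
}
```

With a check against ".amazonaws.com" only, the host from my logs would not be recognized as an ECR registry, which would be consistent with the empty auth={map[]} and the "no basic auth credentials" error.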