goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0

Not able to use AWS IAM role in harbor registry with pre-release v2.5.0-rc1 for AWS S3 backend #16490

Open davidg-sainsbury opened 2 years ago

davidg-sainsbury commented 2 years ago

Expected behavior and actual behavior: Should be able to replicate Harbor images to an AWS S3 storage bucket. As per PR https://github.com/goharbor/harbor/pull/16435, which is included in pre-release v2.5.0-rc1, there should now be support for IAM Roles for Service Accounts (IRSA) in AWS EKS. This now works for harbor-chartmuseum, but we still have an issue with harbor-registry (see error log below).

Steps to reproduce the problem: Create a replication rule to pull from AWS ECR with a destination endpoint of an S3 bucket. Fails to write the images due to an AWS credentials issue. The harbor registry pod has the correct service account mounted with the correct IAM privileges to access and write to the S3 bucket.

Versions:

goharbor/registry-photon:v2.5.0-rc1
goharbor/harbor-registryctl:v2.5.0-rc1

Additional context: harbor helm values:

persistence:
  imageChartStorage:
    type: s3
    s3:
      bucket: my-aws-s3-bucket
      region: eu-west-1
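For IRSA, the storage configuration alone is not enough: the registry pod's service account also needs the IAM role annotation so the EKS pod identity webhook can inject web-identity credentials. A hedged sketch of what that could look like in the chart values (the role ARN, account ID, service account name, and the exact annotation key placement in the Harbor chart are assumptions for illustration, not taken from this issue):

```yaml
# Sketch only: the role ARN and names below are placeholders, and the
# service-account keys may differ between Harbor chart versions.
serviceAccount:
  create: true
  name: harbor
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/harbor-s3-access
persistence:
  imageChartStorage:
    type: s3
    s3:
      bucket: my-aws-s3-bucket
      region: eu-west-1
```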

Error message in harbor registry pod when trying to replicate image to an AWS S3 backend using a service account and IAM policy:

level=info msg="authorized request" go.version=go1.17.7 http.request.host="harbor-core:80" http.request.id=ea5f89c9-677e-4dee-ae0a-4f9a60323086 http.request.method=HEAD http.request.remoteaddr=100.64.248.123
time="2022-03-08T21:38:53.230230565Z" level=error msg="response completed with error" auth.user.name=admin err.code=unknown err.detail="s3aws: NoCredentialProviders: no valid providers in chain. Deprecated."
"HEAD /v2/dce/test-bats-2ef5098/blobs/sha256:0bd848d9f7a107ab36d5c05f0c881b3edfc39f9a239f018f8013cb685c54ed5b HTTP/1.1" 500 104 "" "harbor-registry-client"
stonezdj commented 2 years ago

It seems that distribution v2.8 supports this feature; Harbor's replication code would need changes to take advantage of it.

sirisaacnuketon commented 2 years ago

I'm running Harbor v2.5.0 and still unable to use S3 backed storage for the registry when assuming an IAM role.

err.message="unknown error" go.version=go1.17.7 http.request.host="harbor-core:80" http.request.id=7c0c228a-dd85-4876-a2ae-9d49834dd81d http.request.method=HEAD http.request.remoteaddr=<redacted> http.request.uri="/v2/test/test-app-18a0a1c/manifests/sha256:3989562b3ab4677cbcc1b44d48b0c947d2e4927b0e9d997f6730842792b173f6" http.request.useragent=harbor-registry-client http.response.contenttype="application/json; charset=utf-8" http.response.duration=75.913366ms http.response.status=500 http.response.written=104 vars.name="test/test-app-18a0a1c" vars.reference="sha256:3989562b3ab4677cbcc1b44d48b0c947d2e4927b0e9d997f6730842792b173f6"

Replication logs:

2022-05-05T10:40:57Z [INFO] [/controller/replication/transfer/image/transfer.go:343]: pulling the manifest of artifact test/demo-deployment:8a5820a ...
2022-05-05T10:40:57Z [INFO] [/controller/replication/transfer/image/transfer.go:349]: the manifest of artifact test/demo-deployment:8a5820a pulled
2022-05-05T10:40:57Z [ERROR] [/controller/replication/transfer/image/transfer.go:357]: failed to check the existence of the manifest of artifact test/demo-deployment:8a5820a on the destination registry: http status code: 500, body: 

If I turn S3 storage off, this starts working again. The IAM role linked to the service account is definitely authorised to use S3.

OrlinVasilev commented 2 years ago

@MinerYang @wy65701436 @YangJiao0817 can you comment please?

wazoo commented 2 years ago

I am still running into this issue on 2.6.0; it seems that the SDK can't find the credentials mounted in the registry pod for whatever reason.

I am getting the following in the registry logs:

err.detail="s3aws: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors"

I am not sure if this is just an old AWS SDK version or something else. I have tried re-deploying the pods and verified that the token is mounted in the pod at the standard /var/run/secrets/eks.amazonaws.com/serviceaccount/token location, and that the role has permission to access the S3 bucket, so I am not sure what is up.

schmitz-chris commented 2 years ago

I have the same problem as the previous speakers described. Is there any news yet?

daemon-ian commented 2 years ago

Hi @MinerYang @wy65701436 @YangJiao0817, quite a few people seem to be seeing this issue now. Can you share an update please?

OrlinVasilev commented 2 years ago

Hi @MinerYang @wy65701436 @YangJiao0817 @Vad1mo @goharbor/all-maintainers anyone can provide feedback on the issue ? Thank you!

Vad1mo commented 2 years ago

This is a duplicate of https://github.com/goharbor/harbor/issues/12888 see my note https://github.com/goharbor/harbor/issues/12888#issuecomment-1087582072

OrlinVasilev commented 2 years ago

@Vad1mo why did you reopen it ? :)

Vad1mo commented 2 years ago

I am still unsure if we should keep it open or only close it once it is resolved upstream. Given the velocity of Docker Distribution this won't happen within the next 90 days.

savon-noir commented 2 years ago

It looks like it is fixed in the main branch of the distribution GitHub repo. Any plan for a release supporting IRSA? Thanks!

z0rc commented 2 years ago

Any plan on a release supporting irsa?

I believe not. IRSA support isn't available in any distribution release; it's present only in the main branch. It isn't feasible for Harbor to depend on an arbitrary git SHA from the distribution repo. Feel free to raise this with distribution, so they actually release a version with this feature.

slushysnowman commented 1 year ago

Just to link this from here... https://github.com/distribution/distribution/issues/3756

Would be really nice if they cut a release so we could get this sorted.

wy65701436 commented 1 year ago

Since Harbor leverages the upstream distribution project to interact with AWS, we will keep an eye on upstream releases; as soon as a newer version becomes available, Harbor will bump the dependency.

chrisminton commented 1 year ago

as per the distribution discussion:

The TL;DR is: 2.8 is so far behind main that keeping it up to date with main, or even close to it, is a colossal amount of work, not to mention that it's not even using go modules. We made a decision that 2.8 will only receive security patches from then on, and most of the work should focus on the v3 release.

@wy65701436 is it at all possible to pin to an edge tag with the aws-sdk-go updates included? Distribution will never push this update (until v3.0, but who knows when that will be ready)

Jasper-Ben commented 1 year ago

We just ran into this when setting up harbor. Quite a bummer to be honest. Given the mentioned comment from the discussion, I second the idea of pinning an edge tag for distribution.

daemon-ian commented 1 year ago

Hi @wy65701436, any update on if this is possible? This would really help us move through this long standing blocker!

Jasper-Ben commented 1 year ago

As an alternative to using a distribution edge tag, a less invasive approach could be to use the 2.8.1 distribution source code and only update the aws-sdk-go package to at least 1.23.13 (according to AWS docs) via a patch file. Something similar is already done for redis. Not sure how much pain that would be, e.g. due to possible breaking changes in aws-sdk-go, but it might be worth a shot.

Vad1mo commented 1 year ago

We already apply patches from upstream docker distribution, but only if they are merged but not yet released.
We can do the same in this case. Who is up for a PR?

It's easy, see this PR for reference https://github.com/goharbor/harbor/pull/16322

Jasper-Ben commented 1 year ago

I already gave it a shot, so far unfortunately unsuccessful. I must admit, I am not too familiar with how golang handles dependencies, nor with the harbor build setup. The change from a vendor.conf to golang modules within distribution doesn't make it easier.

What I tried so far:

  1. For test purposes, I created a fork of distribution and pushed a tag to my fork. This tag changes the aws-sdk-go version in distribution's vendor.conf to reflect the one on distribution's main branch. Taking harbor v2.7.1 as a base, I modified the Makefile to use my fork and the test tag and built the harbor containers. After a test run the issue still persisted, so (assuming I did everything correctly and adjusting vendor.conf suffices) it appears the issue is not only due to an outdated aws-sdk.
  2. As I am not quite sure where distribution fixed IAM authentication (if anyone knows more, please link me the relevant PR(s) or commit(s)), I thought I'd try naively updating just the registry/storage/driver/s3-aws path in distribution to reflect the main branch; however, there were also changes to the imports within these files, which caused the build to fail.

I'll be gone on vacation for the next week, if no-one else picks up this task until then and depending on my workload, I'll probably resume working on this on the 3rd (and bring along a colleague with more golang experience than I have :sweat_smile:).

Jasper-Ben commented 1 year ago

P.S.: I just did some more issue digging. It appears that, contrary to what the discussion in the linked issue led me to believe, the current distribution edge release actually does not support IRSA (anymore?), according to https://github.com/distribution/distribution/issues/3275#issuecomment-1378836386. So if anyone wants to tackle this, you would probably want to get it working on distribution edge first, before attempting to backport support to 2.8.1 via a patch file within Harbor.

Tejuvmware commented 1 year ago

Same Issue has been raised here https://github.com/goharbor/harbor/issues/18699. Closed as duplicate.

Hello Team,

Given that the AWS SDK supports assuming a role, pods running in EKS/GKE with AWS S3 as the storage target should be able to assume a role to connect to the S3 buckets.

Example or Brief can be found here : https://confluence.eng.vmware.com/display/public/AEAV/Service+User+Model

Versions:

harbor version: 2.3.3 (via helm chart)
kubernetes: 1.20.6
Cluster: GKE
Storage: AWS S3

Expected behavior and actual behavior:

It would be great if this feature request could be prioritised.

BLOCKER: Currently it is a hard blocker from Harbor for us to get TanzuNet production AWS accounts on-boarded with VMware CloudGate. Having this feature request implemented resolves our blocker.

Let me know if any further details are required.

Thanks,

GowriRegistry commented 1 year ago

To add weight: TanzuNetwork production has a high dependency on (and is blocked by the lack of) the assume-role feature, based on organisational legal demands. There is zero tolerance currently, and it is important for the team to keep the production platform compliant. https://github.com/goharbor/harbor/issues/18699 has been closed as a duplicate to be tracked here.

peterellisjones commented 10 months ago

To add another datapoint: lack of assume role support is making it very difficult for us to use Harbor in a high security compliance environment in the financial services industry. It's not quite a total blocker but it's certainly costing a lot of effort in terms of security/compliance exception management :)

rwd5213 commented 8 months ago

In case anyone cares, this issue seems to be the same when using IMDSv2 while running Harbor on EC2 instances and trying to use the instance role. I increased the hop limit to 4 and it still wasn't working until I set http-tokens to optional.
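For reference, the workaround described above can be sketched with the AWS CLI as below. The instance ID is a placeholder, and the command is printed rather than executed here so it can be reviewed first; note that `--http-tokens optional` weakens the IMDSv2 enforcement, which may conflict with security policy:

```shell
# Sketch of the IMDSv2 workaround: raise the metadata hop limit (containers
# add an extra network hop to the metadata service) and fall back to
# optional tokens. Instance ID is a placeholder; command is echoed only.
CMD='aws ec2 modify-instance-metadata-options
  --instance-id i-0123456789abcdef0
  --http-put-response-hop-limit 4
  --http-tokens optional'
echo "$CMD"
```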

serverbaboon commented 8 months ago

Seconding that: we enforced IMDSv2 to meet security standards (NIST) and Harbor can no longer access the S3 bucket via assume role. I will stop reading the release notes hoping for a fix, then.

rwd5213 commented 7 months ago

Anyone have any update on this? It looks like we are waiting for distribution v3 to be released?

gsharma-jiggzy commented 4 months ago

Anyone have any update on this. Looks like we are waiting for distribution v3 to be released?

It looks like beta.1 has dropped for distribution; not sure how long it'll be before the v3.0.0 release lands.

https://github.com/distribution/distribution/releases/tag/v3.0.0-beta.1

stdmje commented 2 months ago

Any news?