aws-samples / amazon-eks-ami-rhel

This is a Red Hat Enterprise Linux specific forked version of the official awslabs amazon-eks-ami repository.
https://aws-samples.github.io/amazon-eks-ami-rhel/
MIT No Attribution

FIPS ECR endpoints not available in non-US regions #9

Open gpshipley opened 1 week ago

gpshipley commented 1 week ago

Nodeadm sets up containerd with a FIPS ECR endpoint even when the node is not in a US region.

Deploying in "eu-west-2" results in containerd attempting to pull the sandbox container from 602401143452.dkr.ecr-fips.eu-west-2.amazonaws.com which doesn't exist. Only the non-FIPS variant exists for this account.

The error logs show the following:

{"level":"fatal","ts":1725887644.1792407,"caller":"nodeadm/main.go:36","msg":"Command failed","error":"lookup 602401143452.dkr.ecr-fips.eu-west-2.amazonaws.com on <IP.IP.IP.IP>:53: no such host","stacktrace":"main.main\n\t/workdir/cmd/nodeadm/main.go:36\nruntime.main\n\t/root/sdk/go1.22.6/src/runtime/proc.go:271"}

I can't find any option to override this registry URL in the nodeadm arguments or the config file schema.

The only FIPS endpoints are listed at https://aws.amazon.com/compliance/fips/ and this list doesn't include non-US regions.

bradwatsonaws commented 6 days ago

Hi @gpshipley! My guess is that the nodeadm GetEKSRegistry function is returning the URL with a FIPS endpoint based on the result of the GetFipsInfo function, which checks the contents of the file /proc/sys/crypto/fips_enabled. Can you check that file on one of your worker nodes and post the result? Can you also ensure you have the enable_fips parameter set to false in your configuration?

All that said - the nodeadm code is maintained as part of the awslabs amazon-eks-ami repo. The nodeadm code is untouched as much as possible within this repo. I did just run a sync to make sure the nodeadm code was up to date with that repo so you can try it again to see if the problem is resolved. If not, this issue might need to be submitted in that repo if this is found to be a bug.
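
For anyone who wants to check the /proc/sys/crypto/fips_enabled file mentioned above from Go rather than by hand, here is a minimal sketch. It assumes the usual kernel convention that the file contains "1" when FIPS mode is on. The helper name parseFipsFlag is hypothetical and is not taken from the nodeadm source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseFipsFlag is a hypothetical helper (not from nodeadm). It reports
// whether the content of the kernel flag file means FIPS mode is on,
// assuming the file holds "1" when enabled and "0" otherwise.
func parseFipsFlag(content []byte) bool {
	return strings.TrimSpace(string(content)) == "1"
}

func main() {
	const path = "/proc/sys/crypto/fips_enabled"

	data, err := os.ReadFile(path)
	if err != nil {
		// The file may be absent on kernels built without FIPS support.
		fmt.Printf("could not read %s: %v\n", path, err)
		return
	}
	fmt.Printf("FIPS enabled at OS level: %v\n", parseFipsFlag(data))
}
```

Running this on a worker node should give the same answer as reading the file directly.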

gpshipley commented 5 days ago

Thanks. We do have FIPS enabled; this is by design, as we want the extra security controls. The problem is that the registry URLs generated by the GetEKSRegistry function are invalid, i.e. they don't exist.

A glance at the code seems to suggest that URL generation is pretty "dumb": it just concatenates the account number with a hard-coded "ecr-fips" domain component. However, there also seem to be some checks that should error if the registry doesn't exist, but this isn't happening.

My current workaround is to simply change the hard-coded "ecr-fips" to just "ecr" in the ecr.go file before running make.

I will run another pass with the re-sync and see if the issue persists anyway.

bradwatsonaws commented 5 days ago

That all makes sense. This definitely looks like something that should be fixed in the upstream awslabs amazon-eks-ami repo. Would you mind opening an issue there stating that the FIPS endpoint URLs should only be built in the US, GovCloud, and Canada regions even when FIPS is enabled at the OS level based on the FIPS Endpoint documentation? I might take a crack at fixing the code myself there and submitting a PR for it. Once it is fixed there, I will sync the code in this repository as well.
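
A rough sketch of the region gating proposed above might look like the following. The names regionHasFipsEndpoint and registryHost are hypothetical. The "us-" and "ca-" prefixes are an assumption that simply encodes "US, GovCloud, and Canada" from this comment and should be checked against the FIPS endpoint documentation. The host format is copied from the error log in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// fipsRegionPrefixes is an ASSUMPTION encoding "US, GovCloud, and Canada"
// from the discussion above. GovCloud regions (us-gov-*) match "us-".
// Verify against the official FIPS endpoint list before relying on it.
var fipsRegionPrefixes = []string{"us-", "ca-"}

// regionHasFipsEndpoint is a hypothetical helper (not from nodeadm).
func regionHasFipsEndpoint(region string) bool {
	for _, prefix := range fipsRegionPrefixes {
		if strings.HasPrefix(region, prefix) {
			return true
		}
	}
	return false
}

// registryHost is a hypothetical helper that uses the "ecr-fips" service
// name only when FIPS is enabled at the OS level AND the region is assumed
// to have a FIPS endpoint. The ".amazonaws.com" suffix is a simplification.
func registryHost(account, region string, fipsEnabled bool) string {
	service := "ecr"
	if fipsEnabled && regionHasFipsEndpoint(region) {
		service = "ecr-fips"
	}
	return fmt.Sprintf("%s.dkr.%s.%s.amazonaws.com", account, service, region)
}

func main() {
	fmt.Println(registryHost("602401143452", "eu-west-2", true))
	// prints "602401143452.dkr.ecr.eu-west-2.amazonaws.com"
	fmt.Println(registryHost("602401143452", "us-east-1", true))
	// prints "602401143452.dkr.ecr-fips.us-east-1.amazonaws.com"
}
```

A prefix check is the simplest form of the gate. An explicit allow-list of regions would be stricter, at the cost of needing an update whenever a new FIPS endpoint is added.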

Again, we try to only change RHEL-specific things in this repository and keep everything else in line with that upstream repo.