philippmoehler0440 opened 5 years ago
All my docker images have an ARG for the account ID, so I can and do easily replace it to point to different accounts.
This is fine if you are the only consumer, but with dependencies on around 12 other teams, you would always have to share this, even if it could be solved more easily.
This is a great idea, and it's something we've planned for as we made other networking changes such as VPC Endpoint support. We see a bunch of additional benefits. For example, your developers will be able to use a friendly URI like "repo.mycompany.com" instead of having to remember an AWS account number. Also, if you run your own registry today, and you want to switch to ECR so that you don't have to manage (upgrade, monitor, scale, etc.) it, then DNS might help with the transition.
We are interested in hearing more about how customers would like to manage SSL certificates and DNS. Would you use AWS Certificate Manager (ACM) for certs? Would you create a Route 53 hosted zone for the subdomain?
@jtoberon ACM for sure. We already have hosted zones on route53
@jtoberon yes we would use ACM and different hosted zones.
@jtoberon
if you run your own registry today, and you want to switch to ECR ...., then DNS might help with the transition.
^ This is exactly the scenario we are in. If ECR supported custom DNS the switch would be relatively painless. Without custom DNS, there are a number of pain points:
And all of that is on top of the wacky authentication requirements for ECR. Y'all are not helping folks with established (but standard) authentication workflows or existing registries. The pain level goes up with the scale of the established operation. But those big registry users also seem like they would be the juicier targets for y'all, right?
@okor Currently, we're tentatively planning to work on this after cross region replication. What established authentication workflows do you have in mind?
When is the work going to start on this? Interested to contribute to make this live. ✋
Have to dog pile on this one.
I too have wanted this to be officially supported for awhile.
It is possible with an Nginx proxy.
Or API gateway + lambda
NOTE: you can't use the standard docker credential helper, however, as it has a regex that expects the default repo URIs.
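That regex limitation is easy to illustrate. As a sketch only (this pattern is an approximation of the shape of a default ECR hostname, not the helper's actual source), a custom domain simply doesn't look like a default repo URI:

```javascript
// Approximate shape of a default ECR registry hostname:
//   <12-digit account id>.dkr.ecr.<region>.amazonaws.com
// A credential helper keyed on this shape will never match a custom domain.
const DEFAULT_ECR_HOST = /^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com$/;

console.log(DEFAULT_ECR_HOST.test('123456789012.dkr.ecr.us-east-1.amazonaws.com')); // true
console.log(DEFAULT_ECR_HOST.test('repo.mycompany.com'));                           // false
```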
It looks like this was already requested back in early 2016 here: https://forums.aws.amazon.com/thread.jspa?threadID=223934&start=25&tstart=0
But unfortunately, the team at Amazon have been very quiet about when we can expect a fix for this.
+1. We'd like to use ACM for the cert, but probably not route53 for DNS and instead cloudflare (simply because that's what we already do for the domain).
+850
+1
This would definitely be very useful and would save our repositories and documentations from getting cluttered with a long ECR URL that has an account number in it.
+1
+1
+1
(Purposefully not leaving a reaction, as I want to get notified when this is updated.)
You can subscribe to the issue to receive notifications instead of commenting.
+1
+1
+1
Putting a CloudFront distribution in front of ECR should work fine, right?
Almost 2 years have passed. Is there any progress being made?
Progress in your life or mine?
I'll ask again: What's the issue with using cloudfront using ECR as an origin?
Spinning up a non-trivial piece of infrastructure to use 5% of its functionality is not an answer.
There's nothing to spin-up, cloudfront is a hosted service. And one of its main features is exactly what's being asked here.
If AWS added support for custom domains for ECR registries, I can't imagine it'd be much less work than configuring CloudFront anyway -- you'd still have to address things like provisioning ACM certificates and creating Route 53 records. There's not much more than 30-60 minutes of work here.
I'm not sure what you're expecting: it sounds like all the tools are right there, and what you actually need is someone to set them up for you.
There's nothing to spin-up, cloudfront is a hosted service. And one of its main features is exactly what's being asked here.
If you want caching and geo-distribution, sure. But if all you want is a domain, you're spinning up a service with non-trivial deployment times and non-zero costs just to get a domain.
you'd still have to address things like provisioning ACM certificates and creating Route53 records
Sure, but that's fine and probably something I'm already doing if I'm asking for custom domains. CloudFront is not a given.
It's also worth calling out the OP, which says "One way we had seen to 'solve' this is to use an nginx as reverse proxy for the ECR, but this is an effort we don't want to practice." So you're proposing one moving part for another (albeit much easier to manage) moving part.
Has anyone actually managed to get CloudFront working in front of ECR? It's mostly working for me, in that I can login using an ecr login password and I can pull images, but when trying to push images it seems as though it gets half-way and then fails with a message saying unauthorized: authentication required, even though I can successfully pull an image straight after.
Haven't tried with CF no.. We are using a pattern, as suggested above, with a private registry behind a Nginx proxy/forwarder running as a Fargate service fronted by an HTTPS ALB
This allows us a custom fqdn for our ECR, which seems to be working great so far for docker auth/pull/push ops etc.
It is actually pretty neat and tidy to orchestrate and deploy. Obviously, if one desires to really go all out, we could tie the Fargate task to some CloudWatch metrics/alarms and targets with an ASG to control load demand, but for now we are happy to set the container replica count statically...
Anyway.. apologies, I know I haven't really answered your CF question but I wondered if you were keen on exploring an alternative
@julienbonastre seeing as how you have done this in NGINX, are you able to tag, push, and pull using the custom domain name?
For example:
docker build -t ecr.mydomain.com/my-project/my-image:latest .
docker push ecr.mydomain.com/my-project/my-image:latest
docker pull ecr.mydomain.com/my-project/my-image:latest
I guess I'm just asking whether ECR will play nicely if the domain doesn't match the long ECR domain.
Sorry, I should have provided some context..
Here is a trimmed excerpt of the nginx conf I am using in the nginx container task, which runs as an ECS Fargate service (scaled to 2 replicas):
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    ssl_certificate /etc/ssl/certs/nginx-selfsigned.crt;
    ssl_certificate_key /etc/ssl/private/nginx-selfsigned.key;
    ssl_dhparam /etc/ssl/certs/dhparam.pem;
    # ssl_session_cache shared:SSL:1m;
    # ssl_session_timeout 10m;

    chunked_transfer_encoding on;
    client_max_body_size 0;
    server_name _;

    ########################################################################
    # from https://cipherli.st/                                            #
    # and https://raymii.org/s/tutorials/Strong_SSL_Security_On_nginx.html #
    ########################################################################
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
    ssl_ecdh_curve secp384r1;
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off;
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 8.8.8.8 8.8.4.4 valid=300s;
    resolver_timeout 5s;
    # Disable preloading HSTS for now. You can use the commented out header line that includes
    # the "preload" directive if you understand the implications.
    #add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; preload";
    add_header Strict-Transport-Security "max-age=63072000; includeSubdomains";
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    ##################################
    # END https://cipherli.st/ BLOCK #
    ##################################

    location / {
        proxy_pass https://<aws acct id>.dkr.ecr.ap-southeast-2.amazonaws.com;
        proxy_set_header Host "<aws acct id>.dkr.ecr.ap-southeast-2.amazonaws.com";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto "https";
        proxy_read_timeout 900;
    }
}
Yes, I am using a self-signed cert generated on the nginx container itself during init that is referred to in the nginx.conf
ARG ECR_FQDN=ecr.mydomain.com
ARG BASE_NGINX_IMAGE=nginx:latest
FROM ${BASE_NGINX_IMAGE}
RUN mkdir -p /etc/ssl/private
RUN chmod 700 /etc/ssl/private
RUN openssl req -x509 -nodes -days 365 \
-newkey rsa:2048 \
-keyout /etc/ssl/private/nginx-selfsigned.key \
-out /etc/ssl/certs/nginx-selfsigned.crt \
-subj "/C=AU/ST=NA/L=NA/O=MyOrganisationName/CN=${ECR_FQDN}"
RUN openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048
COPY ./nginx.conf /etc/nginx/nginx.conf
EXPOSE 80 443
Note I am referencing my desired target ECR_FQDN within the Dockerfile as a build arg, and then generating the self-signed cert based on this for the SAN/subject..
However, I have actually realised it seems not to matter what fqdn is used, as I recently tested accessing the nginx proxy with different ones and it all still worked fine..
So, in summary, I have an ALB listening on HTTPS using a real ACM SSL cert with our custom fqdn assigned to the ALB, and the listener's target group is set up with the ECS Fargate cluster task as an HTTPS forwarding group to the registered IP targets for the Fargate tasks (which of course is auto-managed by the Fargate/ECS service). So nginx is proxying the HTTPS calls made to the ALB on to the AWS ECR private registry...
I am no security expert, but this looks like a fully SSL-chained request through each hop from the client to the target ECR, and it works a treat for us for all the above.
So far anyway I haven't had/encountered any issues..
Obviously there is now an inherent dependency on the availability/throughput of the ecs-fargate-nginx-proxy task. However, being a Fargate task, this can easily be scaled to multiple fixed replicas, or tied to an ASG/CloudWatch event trigger to scale up/down on demand metrics as desired, to make sure the proxy can handle your workloads..
HTH
Quite a bit off-topic, but if you only have the LB accessing your nginx, you can use a much nicer, more secure, smaller nginx config. Here's the config from Mozilla: https://ssl-config.mozilla.org/#server=nginx&version=1.17.7&config=modern&openssl=1.1.1d&hsts=false&ocsp=false&guideline=5.6 You may have to set TLS 1.2 though.
Alternatively, you can use caddy server as the reverse proxy, since it is far more modern than nginx and cloud native.
Thanks @FernandoMiguel for the tip. Yes, I'll definitely look into this; I too didn't like these extra SSL configs, but I found them under some recommendation in the community. However, you are correct, that recommendation is >6 years old now, so definitely not cool..
I will certainly checkout caddy, as you say, this is a very very simple use-case for a rev proxy so the lighter and more modern the better! 🚀 😎
One thing I just thought of, how would custom domains in ECR mesh with pulling images in ECS or EKS? Would it just work or is there detection on the repository URL that allows the IAM roles to pull images?
@blowfishpro https://docs.aws.amazon.com/AmazonECR/latest/userguide/registry_auth.html
By design, AWS ECR auth token issued for the requesting principal will have all grants/permissions for access as per their IAM policies define.
Right, so naturally the tokens will still work. What I'm wondering is whether ECS/EKS look at the registry url to know whether they should even request a token from ECR when pulling an image.
I haven't had chance to try this yet but I'm almost certain it won't work for EKS as cleanly as it does with a regular ECR repo URL. The AWS docs specify that:
When referencing an image from Amazon ECR, you must use the full registry/repository:tag naming for the image. For example, aws_account_id.dkr.ecr.region.amazonaws.com/my-repository:latest
I think you'll need to add your custom domain ECR repo as a private repo to EKS and use the regular Kubernetes imagePullSecrets feature.
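A minimal sketch of that imagePullSecrets route (the secret name and image path here are hypothetical; the secret itself would be a docker-registry secret created from an aws ecr get-login-password token for the custom domain):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      # image referenced via the custom domain rather than the account-id URL
      image: ecr.mydomain.com/my-project/my-image:latest
  imagePullSecrets:
    # secret of type kubernetes.io/dockerconfigjson scoped to ecr.mydomain.com
    - name: ecr-custom-domain
```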
I've set up the following: an origin of $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com behind the alias docker.mycompany.com.
When I attempt to login, it fails:
$ aws ecr get-login-password | docker login -u AWS --password-stdin docker.mycompany.com
Error response from daemon: login attempt to https://docker.mycompany.com/v2/ failed with status: 401 Unauthorized
My ACM certificate is valid, and it does appear that Route 53 is working as well. When I curl the registry:
$ curl -isSL https://docker.mycompany.com
HTTP/2 401
content-type: text/plain; charset=utf-8
content-length: 15
docker-distribution-api-version: registry/2.0
www-authenticate: Basic realm="https://$ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/",service="ecr.amazonaws.com"
date: Wed, 15 Sep 2021 19:42:46 GMT
x-cache: Error from cloudfront
via: 1.1 9e50af49c68f20e188890e7945ad09a2.cloudfront.net (CloudFront)
x-amz-cf-pop: LAX50-C3
x-amz-cf-id: 247mkimCM2ZnD5fptIlqTAINpC5FSpDAIjJLw_wvrhL1xXblGHOtCQ==
Not Authorized
My Cloudfront configuration in Terraform:
resource "aws_cloudfront_distribution" "default" {
enabled = true
retain_on_delete = true
comment = "ECR Docker Registry front-end."
aliases = ["docker.mycompany.com"]
is_ipv6_enabled = true
http_version = "http2"
default_root_object = "index.html"
price_class = "PriceClass_100"
origin {
origin_id = "ecr-us-east-1"
domain_name = "$ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["TLSv1.2"]
}
}
default_cache_behavior {
target_origin_id = "ecr-us-east-1"
min_ttl = 0
default_ttl = 0
max_ttl = 86400
compress = true
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = true
cookies {
forward = "all"
}
}
}
restrictions {
geo_restriction { restriction_type = "none" }
}
viewer_certificate {
acm_certificate_arn = aws_acm_certificate.default.arn
minimum_protocol_version = "TLSv1.2_2021"
ssl_support_method = "sni-only"
}
tags = {
client = "self"
}
}
Has anyone else here been able to set up a proxy that works with a custom domain, or does Docker/ECR do things that absolutely depend on the Host header or the domain name being an exact match to $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com?
@naftulikay Yes, as per https://github.com/aws/containers-roadmap/issues/299#issuecomment-906901973 however using nginx proxy not CF cdn.
And yes, in my experience it was crucial to set the Host header to the fqdn of the target private registry for it to work and auth your creds correctly.
This nginx fargate proxy solution is working in production for us and haven't faced any issues yet, however as noted by @joshm91 above there may be some further config required for EKS workloads. We are only using ECS/Fargate currently so this isn't an issue.
@julienbonastre I'm going to try Lambda@Edge to change the Host header, because Host is one header that you're not allowed to set in the CloudFront distribution configuration. I found the following solution on Server Fault for a NodeJS Lambda@Edge function:
'use strict';
// force a specific Host header to be sent to the origin
exports.handler = (event, context, callback) => {
const request = event.Records[0].cf.request;
request.headers.host[0].value = 'www.example.com';
return callback(null, request);
};
I have it working with the above code. I have ACM for certificate management, Route 53 for DNS, CloudFront as the edge for the private registry, and Lambda@Edge to rewrite the Host header.
Cloudfront Terraform:
resource "aws_cloudfront_distribution" "default" {
enabled = true
retain_on_delete = true
comment = "ECR Docker Registry front-end."
aliases = ["docker.mycompany.com"]
is_ipv6_enabled = true
http_version = "http2"
default_root_object = "index.html"
price_class = "PriceClass_100"
origin {
origin_id = "ecr-us-east-1"
domain_name = local.ecr_us_east_1
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["TLSv1.2"]
}
}
default_cache_behavior {
target_origin_id = "ecr-us-east-1"
min_ttl = 0
default_ttl = 0
max_ttl = 86400
compress = true
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = true
headers = ["*"]
cookies {
forward = "all"
}
}
# first thing to do on the way in is to rewrite the host header using our lambda function
lambda_function_association {
event_type = "origin-request"
lambda_arn = aws_lambda_function.host_rewrite.qualified_arn
include_body = false
}
}
restrictions {
geo_restriction { restriction_type = "none" }
}
viewer_certificate {
acm_certificate_arn = aws_acm_certificate.default.arn
minimum_protocol_version = "TLSv1.2_2021"
ssl_support_method = "sni-only"
}
tags = {
client = "self"
}
}
Lambda Terraform:
# NOTE in order for cloudfront proxy to ECR to work, we need to rewrite the `Host` header dynamically. Normally, it
# would be possible to do this in Cloudfront, but Cloudfront does not allow rewriting the `Host` header. Therefore,
# we have a Lambda@Edge function which simply overwrites the `Host` header for us.
resource "aws_lambda_function" "host_rewrite" {
function_name = "ecr-docker-host-rewrite"
description = "Rewrites the Host header for incoming requests to ECR to allow custom domains."
runtime = "nodejs14.x"
filename = data.archive_file.lambda_code.output_path
source_code_hash = filebase64sha256(data.archive_file.lambda_code.output_path)
handler = "index.handler"
timeout = 5
memory_size = 128
publish = true
role = aws_iam_role.lambda_host_rewrite.arn
tags = {
client = "self"
}
depends_on = [data.archive_file.lambda_code]
}
The IAM role assigns the provided basic Lambda execution policy, and allows role assumption from both lambda.amazonaws.com and edgelambda.amazonaws.com in order for it to work.
The Lambda function code in JavaScript:
#!/usr/bin/env node
// serverfault answer: https://serverfault.com/a/888776/70024
// event data structure: https://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html
// (many) limitations on Lambda@Edge: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/edge-functions-restrictions.html
const REGISTRY = "MY_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com";
/**
* Callback function for a Cloudfront Lambda@Edge request event. Rewrites the `Host` header to match the specified
* registry host-name.
* @param event The Cloudfront Lambda event.
* @param context Lambda event context.
* @param callback Callback to fire upon complete.
* @returns {*} Invocation result of callback.
*/
exports.handler = (event, context, callback) => {
const request = event.Records[0].cf.request;
// replace host header with registry url
request.headers.host[0].value = REGISTRY;
return callback(null, request);
}
This works; the important bits are that:
- aws_cloudfront_distribution.default_cache_behavior.lambda_function_association is registered to our Lambda function.
- aws_cloudfront_distribution.default_cache_behavior.lambda_function_association.event_type is set to origin-request, so that it will act upon the request sent from an edge to an origin. If it is set to viewer-request, it cannot modify the request. See the docs for an understanding of the different values here.
- aws_cloudfront_distribution.default_cache_behavior.forwarded_values is set to ["*"]. It might work if you just set this to Host, but I'm not sure.

Inserting console.log statements into my Lambda@Edge function does not result in CloudWatch logs being created. I'm not sure why. When I run it as a test function with test data, logs are collected. I can't find documentation on what is provided when the event type is origin-request. If the origin's hostname is included in the event payload, a generic function could be written without hard-coding that will work generically for anyone that uses it.
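On that last point, the origin-request event does appear to carry the origin's domain name (request.origin.custom.domainName for custom origins, per the Lambda@Edge event structure docs), so a generic handler without hard-coding might look like this sketch (untested against CloudFront itself; the function name is mine):

```javascript
'use strict';

// Sketch: rewrite the Host header to whatever domain the origin itself uses,
// so one function works for any distribution without a hard-coded registry URL.
// Assumes an origin-request event for a custom origin, which includes
// request.origin.custom.domainName.
function rewriteHostToOrigin(event) {
  const request = event.Records[0].cf.request;
  const originDomain = request.origin.custom.domainName;
  request.headers.host = [{ key: 'Host', value: originDomain }];
  return request;
}

exports.handler = (event, context, callback) => callback(null, rewriteHostToOrigin(event));
```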
It was a lot of work to get this together, but I can now log-in to the repository and it works as expected.
@naftulikay I haven't looked at this for a while since it never quite worked how I wanted it, but I definitely managed to get a docker login working and could get pushes/pulls working as well, though it would occasionally fail half way through pulling an image with a "401 unauthorized", so there was still some weirdness going on. I think this is the terraform that I had "working". I think the key was to only whitelist the Authorization header (sent by the client). This way CloudFront continues to send the correct Host header to ECR.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
resource "aws_cloudfront_cache_policy" "ecr" {
name = "ecr"
min_ttl = "1"
parameters_in_cache_key_and_forwarded_to_origin {
cookies_config {
cookie_behavior = "none"
}
headers_config {
header_behavior = "whitelist"
headers {
items = ["Authorization"]
}
}
query_strings_config {
query_string_behavior = "none"
}
enable_accept_encoding_brotli = true
enable_accept_encoding_gzip = true
}
}
resource "aws_cloudfront_origin_request_policy" "ecr" {
name = "ecr"
cookies_config {
cookie_behavior = "all"
}
headers_config {
header_behavior = "whitelist"
headers {
items = ["Accept-Charset", "Accept", "Accept-Language", "Accept-Datetime"]
}
}
query_strings_config {
query_string_behavior = "all"
}
}
resource "aws_cloudfront_distribution" "ecr" {
origin {
domain_name = "${data.aws_caller_identity.current.account_id}.dkr.ecr.${data.aws_region.current.name}.amazonaws.com"
origin_id = "ECR"
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2"]
}
}
enabled = true
aliases = ["my.domain.io"]
restrictions {
geo_restriction {
restriction_type = "none"
}
}
default_cache_behavior {
allowed_methods = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "ECR"
origin_request_policy_id = aws_cloudfront_origin_request_policy.ecr.id
cache_policy_id = aws_cloudfront_cache_policy.ecr.id
viewer_protocol_policy = "redirect-to-https"
}
viewer_certificate {
acm_certificate_arn = var.acm_arn
ssl_support_method = "sni-only"
}
}
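For completeness, the DNS side that these configs assume could look like the following sketch (var.zone_id and the record name are placeholders), pointing a Route 53 alias at the distribution defined above:

```hcl
# Sketch: alias record pointing the custom domain at the CloudFront distribution.
resource "aws_route53_record" "ecr" {
  zone_id = var.zone_id # hosted zone for the parent domain (placeholder)
  name    = "my.domain.io"
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.ecr.domain_name
    zone_id                = aws_cloudfront_distribution.ecr.hosted_zone_id
    evaluate_target_health = false
  }
}
```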
@joshm91 Thanks so much for providing your Terraform. My current status of my Cloudfront proxy to ECR:
Trying to push images results in:
$ docker rmi docker.mycompany.com/org/repo:latest
$ docker build -t docker.mycompany.com/org/repo:latest ./
$ docker push docker.mycompany.com/org/repo:latest
The push refers to repository [docker.mycompany.com/org/repo]
fac15b2caa0c: Pushing [==================================================>] 7.168kB
f8bf5746ac5a: Pushing [==================================================>] 3.584kB
d11eedadbd34: Pushing [==================================================>] 4.096kB
797e583d8c50: Pushing [==================================================>] 3.072kB
bf9ce92e8516: Preparing
d000633a5681: Waiting
unauthorized: authentication required
It seems to be able to do most operations but always ends in unauthorized: authentication required.
Here is my configuration, with large parts adapted from yours.
And here's my current Lambda@Edge function code:
Without the Lambda host rewrite, login fails, so I know that it's doing the right thing at least to get login working.
One of the big pains in my side is that I can't seem to get any logs out of my Lambda function to try to see what is included in the event and context. If I invoke the function from the console using test data, it does write to its CloudWatch logs group, and I have given the Lambda execution role the following permissions:
I might have to use an actual CloudWatch logging library, because console.log simply isn't doing anything, even though I know my function is working.
I'd really like to get this working, and if I could get some help from anyone here to get everything working, I'll gladly publish my results in an open source GitHub repository and potentially to the Terraform Modules registry for others to use.
The last open item to address is getting docker push working. Is there a way to debug this using the Docker client to see what is being sent and where?
@julienbonastre if you already have an ALB set up, you should be able to edit its listener rule to have the default action redirect to the ecr address, without needing the additional nginx box
Um, ok.. @jmchuster, yes.. I can.. WTH... I definitely recall trying this originally (and obviously correcting the passed Host header to the target ECR FQDN), and for some reason it didn't seem to be happy...
However I just attempted it again, and yes, it is working fine for auth/pull/push....
This is clearly a much better approach and less infrastructure required! I'm confused now as to why this didn't work for me initially or what pushed me down the direction of using nginx to do the Host header rewrite......... :scratches head:
Anyway. Awesome! I will refactor now and make this even cleaner!
Ok.. I recant my prior statement @jmchuster .... So now I can see the issue........
I am not sure of the why or how, but it does not support docker push requests when using the listener forwarding method..
However, it works 👌🏻 obviously with my original nginx pattern....
I need to see why; there is obviously something really simple happening with how the URI request string is being handled, as the error we receive is:
docker push containers.company.com/team_name/app_name/service:tag-we-want-to-use
The push refers to repository [containers.company.com/team_name/app_name/service]
e2eb06d8af82: Preparing
unsupported: Invalid parameter at 'layerDigest' failed to satisfy constraint: 'Member must satisfy regular expression pattern: [a-zA-Z0-9-_+.]+:[a-fA-F0-9]+'
So, just like that, I am reverting to my original design 👍🏻 🆗 ✅
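For what it's worth, the constraint from that error can be checked locally. Using the exact pattern from the error message (anchored here for a full-string test), an ordinary sha256 layer digest satisfies it, which suggests the digest parameter is being mangled by the forwarding hop rather than being invalid to begin with:

```javascript
// Pattern copied from the error message above, anchored for a full-string match.
const LAYER_DIGEST = /^[a-zA-Z0-9-_+.]+:[a-fA-F0-9]+$/;

// A representative digest of the form docker sends (hex value fabricated):
console.log(LAYER_DIGEST.test('sha256:e2eb06d8af82aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')); // true

// A URL-encoded colon (one plausible kind of mangling) breaks it:
console.log(LAYER_DIGEST.test('sha256%3Adeadbeef')); // false
```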
@naftulikay I got to the same problem. Did you manage to fix it?
@boruttkal @julienbonastre so everything works but push, it seems. Is there anything else I need to do to get push working? I think it's an internal Docker request payload thing and would not be easy to rewrite. Basically, I think that what is happening is that I build my image:
docker build -t docker.naftuli.wtf/org/repo:latest ./
And then I try to push that, and the actual request payload fails because AWS probably sees a 404: there is no repository identified by the long name. I have tried tagging with multiple tags and pushing to the right place, but it wasn't working last time I checked.
Has anyone got the full lifecycle working, and what commands are you using to interact with the pretty URL as opposed to the ugly AWS direct URL?
@naftulikay Yes, the pattern I described initially is working perfectly for our org. It supports `docker push myecr.org.com/folder/service:tag` without any issues.
As of yesterday, I also managed a pattern to support this pretty ECR FQDN within ECS task container definitions on ECS EC2/Fargate arrangements - using the existing Private Registry authentication mechanism and a lambda which rotates the docker auth token every 11hrs.
I will summarise and re-post later this morning how this has been achieved but it all definitely works and supports push/pull without concern 🤗👌🏼🚀
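For context on the token-rotation piece: ECR's `GetAuthorizationToken` API returns an `authorizationToken` that is base64-encoded `AWS:<password>`, valid for 12 hours, which is why an 11-hour rotation schedule works. A rotating Lambda would call that API and store the split credentials for the ECS private-registry auth; a hedged sketch of just the decoding step (the SDK call and secret storage are omitted, and the function name is my own):

```javascript
// Hypothetical helper: ECR's GetAuthorizationToken returns authorizationData
// whose authorizationToken field is base64("AWS:<password>"). ECS private
// registry auth (and plain `docker login`) want the two halves separately.
function decodeEcrAuthToken(authorizationToken) {
  const decoded = Buffer.from(authorizationToken, 'base64').toString('utf8');
  const idx = decoded.indexOf(':');
  return {
    username: decoded.slice(0, idx),  // always "AWS" for ECR
    password: decoded.slice(idx + 1), // the rotating credential
  };
}
```

The rotation Lambda would then write `{username, password}` to the secret that the ECS task definition's private registry authentication points at.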
What method have you used @naftulikay to proxy your ecr ssl requests?
@julienbonastre Excellent! Glad to hear you found a way to make it work. I'm on the CloudFront :point_right: Lambda side of things, so maybe things are different here.
Can you share the following?

- `docker build` and tagging of your images: are you tagging with both the ECR URL and your custom URL?
- `docker login`: are you using the ECR URL or your custom URL?
- `docker push`: are you pushing to the ECR URL or your custom URL?
- `docker pull`: are you pulling from the ECR URL or your custom URL?

If you could answer these with the actual commands, that would be very helpful, so I know what to expect and how to change my implementation to get things working.
Also, I know you described your nginx configuration above, but can you give us a little more info on how all of this is set up? Do you have one or more nginx nodes running with public IPs, listening for TLS on 443, modifying the `Host` header, and passing to the upstream? Is that all there is to it?
My setup is described in detail above including all the Terraform and the Lambda host-rewrite JavaScript code.
The route from the internet to my ECR URL using the proxy:

- Internet → CloudFront (TLS via ACM) → Lambda@Edge, which rewrites the `Host` header appropriately.

As per above, I get strange unauthorized errors when I attempt to push:
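For reference, the Lambda@Edge host-rewrite referenced here amounts to an origin-request handler along these lines. The account ID and region are placeholders, and this is a sketch of the pattern, not the actual code from the Terraform setup described earlier:

```javascript
// Hypothetical CloudFront origin-request (Lambda@Edge) handler: rewrite the
// Host header from the custom domain to the real ECR registry hostname so
// ECR's virtual-hosted endpoint accepts the proxied request.
const ECR_HOST = '123456789012.dkr.ecr.us-east-1.amazonaws.com'; // placeholder

const handler = async (event) => {
  const request = event.Records[0].cf.request;
  // CloudFront represents each header as an array of {key, value} pairs
  request.headers['host'] = [{ key: 'Host', value: ECR_HOST }];
  return request;
};

exports.handler = handler;
```

Note that this rewrites only the `Host` header; the `Authorization` header from `docker login` passes through untouched, which is why login and pull can work while push may still fail on payload-level checks.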
```
$ docker build -t docker.mycompany.com/org/repo:latest ./
$ docker push docker.mycompany.com/org/repo:latest
The push refers to repository [docker.mycompany.com/org/repo]
fac15b2caa0c: Pushing [==================================================>] 7.168kB
f8bf5746ac5a: Pushing [==================================================>] 3.584kB
d11eedadbd34: Pushing [==================================================>] 4.096kB
797e583d8c50: Pushing [==================================================>] 3.072kB
bf9ce92e8516: Preparing
d000633a5681: Waiting
unauthorized: authentication required
```
I'm able to `docker login` and `docker pull`, but `docker push` is not working.
My theory as to why it's not working is that the internal Docker HTTP payloads specify the image name using the custom URL `docker.mycompany.com/org/repo:latest` rather than `MY_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/org/repo:latest`, and when ECR receives a push for an image whose URL does not match `MY_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com`, it rejects the request. Which is why I wonder if you're doing something special for the `docker push` case.
EDIT: If I'm able to get all of this working with CloudFront, ACM, and Lambda, I will (:pray:) publish my code as a Terraform module to the Terraform module registry so that others can do this without any hassle.
**Tell us about your request**
Currently a repository URI looks like this: `<account_id>.dkr.ecr.<region>.amazonaws.com/<repository>`. Account ID and region are movable parts, which has negative effects in the scenarios described below. It would be helpful to be able to define an alternate URI for ECR repositories.

**Which service(s) is this request for?**
ECR (and maybe other container services?)
**Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?**
Our team provides a Docker image that ~12 other teams use as a build tool for frontend resources within their pipelines. We identified disaster recovery scenarios in which the current ECR URI is a disadvantage:
(1) Unavailability of ECR within the specified region
(2) Disaster recovery for the ECR account
An alternate repository URI could be a fixed interface for consumers: changes to the account ID or region behind it would no longer affect them.
**Are you currently working around this issue?**
One way we have seen to "solve" this is to run nginx as a reverse proxy in front of ECR, but that is operational overhead we don't want to take on.
**Additional context**
This topic from January 2016 also describes some of the same pain points.