aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(@aws-cdk/cloudfront_origins): S3Origin synths an incorrect domain url and no OAI association for buckets with public access disabled, and static hosting enabled #19539

Open its-mirus-lu opened 2 years ago

its-mirus-lu commented 2 years ago

What is the problem?

TL;DR

Using S3Origin from @aws-cdk/cloudfront_origins results in an origin that is unreachable (403 Forbidden) if the bucket has public access disabled and static hosting enabled. The code checks whether the bucket is a static hosted website and, if it is, creates an HttpOrigin. The problem is that an HttpOrigin can't be associated with an OAI to access the bucket contents, which is needed if the bucket has public access disabled.

Background

I'm contributing an update to the aws-cdk-examples repo for static sites hosted on s3 here -> https://github.com/its-mirus-lu/aws-cdk-examples/tree/fix_static_site_distro/typescript/static-site using CDK v2

I noticed that when I deploy the app, I'm getting 403 errors when trying to hit the website both via the CNAME registered in Route53 and via the Cloudfront endpoint.

Digging into the synthesized Cloudformation, I discovered that the distribution definition is missing an OAI association even though the CDK code uses the S3Origin class with the OAI defined in the props.

I discovered through the AWS Console that the URL was incorrect (the static website url was used instead of the REST url) which also prevented the OAI from being associated.

Reproduction Steps

1) Attempt to create a stack with an S3 bucket with static hosting enabled and with public access denied
2) Create an OAI and grant it getObject permission in the bucket policy
3) Create a Cloudfront distribution and define a defaultBehavior with an S3Origin as its props (using the bucket and OAI defined previously)
4) Attempt to deploy

All the above steps can be found here: https://github.com/its-mirus-lu/aws-cdk-examples/tree/fix_static_site_distro/typescript/static-site

What did you expect to happen?

For s3 buckets that host static content that have public access disabled, the following must be configured:

1) the origin's domain should be the s3 rest url, not the s3 static website url
2) an OAI should be associated with the origin in Cloudfront
3) the website can be navigated to without error

Note: Here is a link to an article on AWS about debugging 403 errors in Cloudfront and S3... the specific excerpt is:

"If you don't want to allow public (anonymous) access to your S3 objects, then change your configuration to use the S3 REST API endpoint as the origin of your distribution. Then, configure your distribution and S3 bucket to restrict access using an origin access identity (OAI). For instructions, see Using a REST API endpoint as the origin with access restricted by an OAI"

This is the URL that is synthed: .s3-website.us-east-1.amazonaws.com
This is the URL that should be synthed: .s3.us-east-1.amazonaws.com
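
To tell the two endpoint styles apart when inspecting an origin, a small check like the one below can help. This is a minimal sketch, not part of the CDK: `classifyS3Endpoint` is a hypothetical helper, the bucket name is made up, and the patterns assume standard `amazonaws.com` hostnames (website endpoints use either a dot or a dash after `s3-website`, depending on the region).

```typescript
// Hypothetical helper: classify an S3 origin hostname by endpoint style.
type S3EndpointKind = "website" | "rest" | "unknown";

function classifyS3Endpoint(hostname: string): S3EndpointKind {
  const host = hostname.toLowerCase();
  // Website endpoints: <bucket>.s3-website.<region> or <bucket>.s3-website-<region>.
  if (/\.s3-website[.-][a-z0-9-]+\.amazonaws\.com$/.test(host)) {
    return "website";
  }
  // REST endpoints: <bucket>.s3.<region>.amazonaws.com, or the legacy
  // global form <bucket>.s3.amazonaws.com.
  if (/\.s3(\.[a-z0-9-]+)?\.amazonaws\.com$/.test(host)) {
    return "rest";
  }
  return "unknown";
}

// "my-bucket" is an assumed name for illustration only.
const synthedKind = classifyS3Endpoint("my-bucket.s3-website.us-east-1.amazonaws.com");
const expectedKind = classifyS3Endpoint("my-bucket.s3.us-east-1.amazonaws.com");
```

Only the "rest" style can carry an OAI association, so an origin classified as "website" here is a sign of the problem described in this issue.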

What actually happened?

If I visit the Cloudfront section in the AWS console I see the following:

1) Navigate to the Cloudfront section of the AWS Console
2) The origin's domain is the s3 static website url (it should be the s3 bucket's REST endpoint)
3) There is no option for OAI association (I only see the option when I change the origin domain to point to the s3 bucket's REST url)
4) The OriginConfig that is synthed is a CustomOriginConfig and not an S3OriginConfig as expected

Furthermore, looking at the synthesized Cloudformation, I do not see an OAI associated with the origin, and the domain used is the static website URL

"Origins": [
            {
              "CustomOriginConfig": {
                "OriginProtocolPolicy": "http-only",
                "OriginSSLProtocols": [
                  "TLSv1.2"
                ]
              },
              "DomainName": {
                "Fn::Select": [
                  2,
                  {
                    "Fn::Split": [
                      "/",
                      {
                        "Fn::GetAtt": [
                          "StaticSiteSiteBucket1A888BC8",
                          "WebsiteURL"
                        ]
                      }
                    ]
                  }
                ]
              },
              "Id": "MyStaticSiteSiteDistributionOrigin1782C1326"
            }
          ],
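
To see why that DomainName ends up as the website hostname, the two intrinsic functions can be evaluated by hand. The sketch below uses plain stand-ins for Fn::Split and Fn::Select (they are not CloudFormation APIs) and an assumed example value for the bucket's WebsiteURL attribute; the bucket name is made up.

```typescript
// Minimal stand-ins for the two intrinsic functions used in the template.
function fnSplit(delimiter: string, source: string): string[] {
  return source.split(delimiter);
}

function fnSelect(index: number, values: string[]): string {
  const picked = values[index];
  if (picked === undefined) {
    throw new Error(`Fn::Select index ${index} is out of range`);
  }
  return picked;
}

// Assumed example value for Fn::GetAtt [bucket, "WebsiteURL"].
const exampleWebsiteUrl = "http://my-bucket.s3-website.us-east-1.amazonaws.com";

// "http://host" split on "/" gives ["http:", "", "host"], so index 2 is the host.
const originDomainName = fnSelect(2, fnSplit("/", exampleWebsiteUrl));
```

Under these assumptions `originDomainName` is the static website hostname, which matches the origin domain seen in the console.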

Looking at the actual code for S3Origin, the constructor actually creates an HttpOrigin (instead of an S3 bucket origin) if the bucket has static hosting enabled. One solution would be to add an 'and' so that the test condition becomes

this.origin = bucket.isWebsite && bucket.<isPublicRead> ? ...

where isPublicRead is a helper (which does not exist yet) that determines whether public read is enabled.
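
The difference between the current check and the proposed one can be sketched as a small decision function. This is a simplified model and not the CDK's implementation: `BucketInfo`, `OriginKind`, and both function names are hypothetical, and `isPublicRead` is the helper proposed above, which the CDK does not provide.

```typescript
// Simplified stand-in for the bucket attributes the check would need.
// `isPublicRead` is hypothetical; it is the helper proposed in this issue.
interface BucketInfo {
  isWebsite: boolean;
  isPublicRead: boolean;
}

type OriginKind = "http-website-origin" | "s3-rest-origin";

// Behavior described in this issue: website hosting alone selects the HTTP origin.
function currentOriginKind(bucket: BucketInfo): OriginKind {
  return bucket.isWebsite ? "http-website-origin" : "s3-rest-origin";
}

// Proposed behavior: use the website endpoint only when it is publicly readable.
function proposedOriginKind(bucket: BucketInfo): OriginKind {
  return bucket.isWebsite && bucket.isPublicRead
    ? "http-website-origin"
    : "s3-rest-origin";
}

// The case from this issue: static hosting on, public access off.
const privateWebsiteBucket: BucketInfo = { isWebsite: true, isPublicRead: false };
const kindToday = currentOriginKind(privateWebsiteBucket);
const kindProposed = proposedOriginKind(privateWebsiteBucket);
```

For the private website bucket, the current check picks the HTTP origin (no OAI possible), while the proposed check falls back to the REST origin, where an OAI can be attached.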

CDK CLI Version

2.13.0

Framework Version

No response

Node.js Version

16.14.0

OS

12.2.1

Language

Typescript

Language Version

TS 3.9.7

Other information

The workaround is to disable static website hosting on any S3 bucket that serves static website content and has public read disabled.

rpivo commented 2 years ago

Not sure if this is directly related to the same issue, but after migrating from CloudFrontWebDistribution to Distribution, the single origin I had for the distribution no longer has an attached origin access identity.

Previously, I was manually creating an origin access identity and attaching it in the CloudFrontWebDistribution configuration for the origin.

According to the docs (see here: https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_cloudfront-readme.html#behaviors), creating an S3 origin with S3Origin should automatically produce an origin access identity.

However, I don't see any attached origin access identity to the origin after deploying, and no new origin access identity can be found in the origin access identity dashboard.

I also manually tried adding a new origin access identity in the Distribution config:

    const originAccessidentity = new OriginAccessIdentity(...);
    const bucket = new S3Bucket(...);

    new Distribution(this, 'StaticWebsiteDistribution', {
      ...
      defaultBehavior: {
        allowedMethods: AllowedMethods.ALLOW_ALL,
        origin: new S3Origin(bucket, {
          originAccessIdentity: originAccessidentity,
          originPath: '/v1',
        }),
      },
     ...
    });

This also didn't seem to attach an origin access identity to the origin.

Not sure if there are any other ways to get this working with Distribution and will likely fall back to using CloudFrontWebDistribution for now.

Hunter-Thompson commented 2 years ago

I have a similar issue.

I want both the HTTP endpoint of the website and the REST endpoint, but since bucket.isWebsite is read to determine the URL, it makes things messy.

Can we remove this check? What if someone wants the REST endpoint even though isWebsite is true?

yangliunewyork commented 2 years ago

I encountered a similar problem here: https://stackoverflow.com/questions/72051986/get-403-error-when-trying-to-use-cloudfront-oai-for-s3-bucket-access.

ianwow commented 2 years ago

I was able to solve this problem by specifying default_root_object='index.html' in the cloudfront.Distribution object instead of website_index_document='index.html' in the s3.Bucket object. See https://github.com/aws/aws-cdk/issues/14019.
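
One thing to keep in mind with this swap is that the two settings are not equivalent. The sketch below models the difference under two assumptions: CloudFront applies the default root object only to requests for the root of the distribution, and the S3 website endpoint applies the index document to any path ending in a slash. Both function names are hypothetical and the model ignores redirects.

```typescript
// Key requested from the origin when only a CloudFront default root object is set.
function keyWithDefaultRootObject(path: string, rootObject: string): string {
  return path === "/" ? "/" + rootObject : path;
}

// Key served by an S3 website endpoint with an index document configured.
function keyWithWebsiteIndexDocument(path: string, indexDocument: string): string {
  return path.slice(-1) === "/" ? path + indexDocument : path;
}

const rootViaCloudFront = keyWithDefaultRootObject("/", "index.html");
const subdirViaCloudFront = keyWithDefaultRootObject("/docs/", "index.html");
const subdirViaWebsite = keyWithWebsiteIndexDocument("/docs/", "index.html");
```

So this workaround fits sites that only need index.html at the root; sites that rely on index documents in subdirectories would need something extra.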

LoganArnett commented 1 year ago

Has there been any movement on this issue? The proposed fixes do not work with the CDK itself. Making the changes manually works, but I still cannot use the CDK to deploy an S3 bucket that is private to everything except Cloudfront without getting the 403 error.

LoganArnett commented 1 year ago

I found a different workaround, although it isn't something I would expect to be limited to. I removed the websiteIndexDocument and websiteErrorDocument and then added errorResponses to the CDN Distribution:

errorResponses: [
  {
    httpStatus: 404,
    responseHttpStatus: 200,
    responsePagePath: '/index.html'
  },
  {
    httpStatus: 403,
    responseHttpStatus: 200,
    responsePagePath: '/index.html'
  },
  {
    httpStatus: 400,
    responseHttpStatus: 200,
    responsePagePath: '/index.html'
  }
]
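
Since the three entries differ only in the status code, the list can be generated. This is a minimal sketch: `SpaErrorResponse` and `spaErrorResponses` are hypothetical names, and the interface only mirrors the three fields used in the snippet above rather than the CDK's own type.

```typescript
// Shape mirrors the fields used in the snippet above.
interface SpaErrorResponse {
  httpStatus: number;
  responseHttpStatus: number;
  responsePagePath: string;
}

// Hypothetical helper: map each origin error status to a 200 response serving `pagePath`.
function spaErrorResponses(statuses: number[], pagePath: string): SpaErrorResponse[] {
  return statuses.map((httpStatus) => ({
    httpStatus,
    responseHttpStatus: 200,
    responsePagePath: pagePath,
  }));
}

// Same three statuses as in the workaround above.
const spaFallbacks = spaErrorResponses([404, 403, 400], "/index.html");
```

Note that rewriting 403 to 200 also hides genuine access-denied responses from the bucket, so a misconfigured OAI would no longer show up as an error.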

ChrisLane commented 1 year ago

Somewhat related: the OAI not being attached to a bucket's policy.

I just found a bug in my stacks where the OAI was not being added to a bucket's policy if the bucket was fetched with Bucket.fromBucketArn(). Instead, I have to pass the bucket into the stack and use it more directly (which opens another can of worms with cyclic dependencies). It would be nice if a warning were emitted when the policy fails to be added to a bucket.

Hopefully this helps anyone else in the same situation.
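
Until such a warning exists, the check can be done by hand. The sketch below assumes the result of addToResourcePolicy exposes a `statementAdded` flag, as aws-cdk-lib's AddToResourcePolicyResult does; `BucketLike`, `PolicyResultLike`, `addToPolicyOrWarn`, and the stub bucket are hypothetical stand-ins so the example runs without the CDK installed.

```typescript
// Minimal structural types, modeled on the result shape of addToResourcePolicy.
interface PolicyResultLike {
  statementAdded: boolean;
}

interface BucketLike {
  addToResourcePolicy(statement: unknown): PolicyResultLike;
}

// Hypothetical helper: try to add a statement and report failure via `warn`.
function addToPolicyOrWarn(
  bucket: BucketLike,
  statement: unknown,
  warn: (message: string) => void,
): boolean {
  const result = bucket.addToResourcePolicy(statement);
  if (!result.statementAdded) {
    warn("Bucket policy statement was not added; grant access in the stack that owns the bucket.");
  }
  return result.statementAdded;
}

// Stand-in for an imported bucket, whose policy cannot be modified here.
const importedBucketStub: BucketLike = {
  addToResourcePolicy: () => ({ statementAdded: false }),
};
const collectedWarnings: string[] = [];
const wasAdded = addToPolicyOrWarn(importedBucketStub, {}, (m) => collectedWarnings.push(m));
```

In a real stack the `warn` callback could forward to whatever logging or annotation mechanism is already in use.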