jetbridge / cdk-nextjs

Deploy a NextJS application using AWS CDK
https://constructs.dev/packages/cdk-nextjs-standalone
Apache License 2.0
265 stars 45 forks source link

Improve Security Of Public Lambda Function URL #108

Closed bestickley closed 1 year ago

bestickley commented 1 year ago

Currently this construct deploys Lambda Function URLs with AuthType of NONE. This allows anyone to call the function if they know the Lambda URL. A more secure approach would be to put the Lambda behind an API Gateway which verifies a secret header set by CloudFront and have the CloudFront behavior call the API GW. This would ensure Lambdas can only be called through CloudFront. Is there any interest from the maintainer of this project/community for such a feature? I'd possibly be interested in contributing.

bestickley commented 1 year ago

If there are any other techniques that could avoid public Lambda Function URLs, I'd be interested to hear other ideas!

revmischa commented 1 year ago

You are welcome to contribute something if you want. What exactly is the security threat of someone hitting the lambda URL directly? What could an attacker do?

NextjsSite from SST uses function URLs too (though it and this project sort of feed off each other): https://github.com/serverless-stack/sst/blob/master/packages/sst/src/constructs/NextjsSite.ts#L292

khuezy commented 1 year ago

One thing I can think of would be a Denial-of-Wallet attack since it's bypassing the CDN.

bestickley commented 1 year ago

@revmischa, great! I'll begin working on something. I just wanted to make sure it wouldn't be a waste of time.

Security threats quoted from this blog include:

The first 3 are unlikely because Open Next generates the lambda function code, but it's possible and the fix is pretty simple.

Relevant articles:

To enable this feature, I'm thinking I could add a top level property on the construct called serverType with a type value of an enum like below:

enum ServerType {
  LambdaUrl = "LambdaUrl";
  LambdaApiGateway = "LambdaApiGateway";
}

This would default to ServerType.LambdaUrl. I could also foresee someone maybe adding AppRunner or other configurations for running the next server in the future but that's outside of the scope of this issue. I'd also want to allow users to be able to edit the API Gateway props so I'd add an optional property onto the defaults property. What do you think?

revmischa commented 1 year ago

Honestly I think this is a waste of time. Can you name an actual specific attack someone could perform by having the lambda function URL? I read that blog, it was vague, authentication doesn't apply here.

bestickley commented 1 year ago

@revmischa, I understand your perspective but my security team would beg to differ :) Are you still open to a PR to add this functionality? It would of course be opt-in. Default would be to use Lambda URLs.

revmischa commented 1 year ago

I'm open to it sure. I'm curious what actual attack your security team thinks someone could do here. So far it sounds like security theater. I read the blog posts and don't believe they apply to this function. They talk about bypassing authentication, which does not apply here. The function is invoked via HTTP, there are no other parameters to bypass either besides the HTTP request.

You are welcome to direct your security team here to lay out what their actual threat model is here. If it's just pointing at blog posts which highly irrelevant concerns, they will have to explain at least theoretically what could be exploited here.

UltimateForm commented 1 year ago

Hello @bestickley does this apply at all for api routes without authentication (which is how most of my applications are set up)

And if we discard the authentication bypass concerns, is the only concern that someone could grab the function url and repeat a bunch of requests in an attempt to cause financial loss through an extrapolation of runtime costs?

PS: not a security person , so sorry in advance if the question is stupid :)

khuezy commented 1 year ago

yea I'm not a security person either. I'm scratching my head on the actual security concern here. The only argument that makes sense for me is the server runtime cost a bad actor can cause.

revmischa commented 1 year ago

Can ou explain how an API gateway prevents some possible attack here? Now you are paying for the APIGW requests too. I am extremely unconvinced there is a problem here or that adding an APIGW will make anything better vs. adding unnecessary complexity and cost.

derekmurawsky commented 1 year ago

Hi, I am a security person. Sort of. It's complicated. 😆 We were just chatting about this in the discord. The main security risk I see would be a denial of wallet attack. You could also possibly eat up the concurrency limit for the account/region causing other lambdas to be throttled. Though the risks are real, the recommendation to use API Gateway by itself will not address those risks. You would need to use API Gateway with a Web Application Firewall and DDOS prevention rules on it to truly address the cited risks. And doing that would go against the whole point of a function URL. They are supposed to be simpler versions of a API gateway for only one purpose. This is also why many larger enterprises don't want to use them. A simpler solution to mitigate the above risks would be to turn on concurrency limits on the function itself. That would contain the blast radius of a DDOS attack to just that function URL. Or just use API Gateway instead, with a WAF, and enjoy the complexities of that deployment. I'm not sure what methods are currently supported as I was just dragged into this conversation from an external perspective, but I would likely want to support both deployment methods to increase adoptability if they are not both already supported. 😃

khuezy commented 1 year ago

That makes sense. Thanks for your expertise and input!

bestickley commented 1 year ago

From security team's documentation:

... a Lambda function in your account that allows any user in AWS to perform operations with the function. This may allow third parties to invoke or modify your functions, or use the permissions granted to the function's associated execution role.

I think the key issue is the resource based policy statements on the server and image optimization functions that have "Principal": "*". This means anyone can invoke your function (not just visiting url) with whatever parameters they want. The functions that Open Next creates shouldn't be able to be invoked in such a way that an attacker can do something harmful (besides DoS, DoW), but it's a possibility if that code changes in future.

{
  "Version": "2012-10-17",
  "Id": "default",
  "Statement": [
    {
      "Sid": "ss-stickb-ui-NextSiteImgOptFninvokefunctionurl4EE7A243-N6GPZTVZJWUP",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "lambda:InvokeFunctionUrl",
      "Resource": "arn:aws:lambda:us-east-1:1234567890:function:ss-stickb-ui-NextSiteImgOptFnCE453280-02ZhQYjuH0yy",
      "Condition": {
        "StringEquals": {
          "lambda:FunctionUrlAuthType": "NONE"
        }
      }
    }
  ]
}

My security team also reminded me it's about Defense-in-Depth. This is an added layer of security that may not be needed because OpenNext's outputted lambda code is secure, but it's still valuable.

@derekmurawsky, I agree adding WAF to API GW is good idea, but I'm not sure this is right:

Though the risks are real, the recommendation to use API Gateway by itself will not address those risks

Because adding API GW with Authorizer that ensures header set by CloudFront URL won't allow an attacker to invoke the function. Make sense?

derekmurawsky commented 1 year ago

And as I was thinking about it, there is one silly thing you could do if you really wanted to switch the AUTH type to IAM... You could have the CloudFront distribution call a lambda to modify the request (via the Origin request lambda in the CDN) and sign it with a valid auth token to trigger the function URL. That would mean using two lambdas per call, but it would lock it down to only the cloudfront distro being able to make the call... But you would have to cache that auth token in the cloudfront lambda very carefully so as not to flood AWS IAM in a DDOS scenario.

  1. External entity calls cloudfront
  2. Cloudfront invokes lambda to modify request (sign it) before passing it to the lambda function URL
  3. Cloudfront passes modified request to function url
  4. Function url validates authorization
  5. Function URL runs as normal

This is a bit silly, imho, but would work.

https://docs.aws.amazon.com/lambda/latest/dg/urls-auth.html

derekmurawsky commented 1 year ago

From security team's documentation:

... a Lambda function in your account that allows any user in AWS to perform operations with the function. This may allow third parties to invoke or modify your functions, or use the permissions granted to the function's associated execution role.

I think the key issue is the resource based policy statements on the server and image optimization functions that have "Principal": "*". This means anyone can invoke your function (not just visiting url) with whatever parameters they want. The functions that Open Next creates shouldn't be able to be invoked in such a way that an attacker can do something harmful (besides DoS, DoW), but it's a possibility if that code changes in future.

{
  "Version": "2012-10-17",
  "Id": "default",
  "Statement": [
    {
      "Sid": "ss-stickb-ui-NextSiteImgOptFninvokefunctionurl4EE7A243-N6GPZTVZJWUP",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "lambda:InvokeFunctionUrl",
      "Resource": "arn:aws:lambda:us-east-1:1234567890:function:ss-stickb-ui-NextSiteImgOptFnCE453280-02ZhQYjuH0yy",
      "Condition": {
        "StringEquals": {
          "lambda:FunctionUrlAuthType": "NONE"
        }
      }
    }
  ]
}

My security team also reminded me it's about Defense-in-Depth. This is an added layer of security that may not be needed because OpenNext's outputted lambda code is secure, but it's still valuable.

@derekmurawsky, I agree adding WAF to API GW is good idea, but I'm not sure this is right:

Though the risks are real, the recommendation to use API Gateway by itself will not address those risks

Because adding API GW with Authorizer that ensures header set by CloudFront URL won't allow an attacker to invoke the function. Make sense?

Ah, I see the assertion here, but I think it may be flawed. If the policy that is created is what you put above, it means that only the functionURL can call that function because of the condition key. So "This means anyone can invoke your function (not just visiting url) with whatever parameters they want." isn't correct because if you just called the function from another AWS role, it wouldn't have the "lambda:FunctionUrlAuthType" context so that policy wouldn't apply. Since that is only assigned by AWS and the IAM level, I don't think it could easily be bypassed. It should be easy enough to test that, though.

If a user got the function URL's URL, then yes, they could point anything they want at it. That's why FunctionURL lambdas must handle their own auth, verification, etc. By design, a function url with an auth type of none is meant to be invoked by anyone, anywhere. The code within it, and the role that the lambda has should be very carefully considered with that fact in mind. That is literally the purpose of a function URL. The principle of * must be there for it to be able to do that. As such, I would never use a function url for a high-visibility or sensitive app. I would absolutely want an API Gateway and WAF to handle potential adverse scenarios. Optionality and Flexibility are key at scale with sensitive stuff. On a related note: I do suspect they will eventually add first class support for them to cloudfront, just like they did for S3. It's too convenient a pattern not to.

Regarding "Because adding API GW with Authorizer that ensures header set by CloudFront URL won't allow an attacker to invoke the function. " - But then you're not using a function URL at all and, instead, just invoking the function from API gateway. You're transforming the cost structure a bit here and running Cloudfront + api gateway + a lambda authorizer + the functional lambda to... prevent someone from calling the function URL directly. Which is the point of a function URL. I agree that's a better and more enterprise-y pattern in general, but I think it adds complexity for scenarios where you may not need it. Personally, I would support multiple deployment options for different scenarios. See my previous comment for a way to make it work just using Cloudfront and Function URLs.

bestickley commented 1 year ago

@derekmurawsky, you're right! Thanks for correcting me. That's good that calling the Lambda function is restricted to the calling it through the function URL.

Your recommendation for IAM AUTH for function URL is very interesting. Maybe that would be a better option than API Gateway. Do you think this construct should support both? This makes me think API GW option is still needed:

The principle of * must be there for it to be able to do that. As such, I would never use a function url for a high-visibility or sensitive app. I would absolutely want an API Gateway and WAF to handle potential adverse scenarios.

The more I think about it, the more I like your creative simple solution of signing the Lambda Function URL and using IAM AUTH type. Very smart! I wonder how the costs/latency compare to API GW solution.

derekmurawsky commented 1 year ago

@bestickley I think that constructs should support options intelligently. I don't know enough about this construct to say definitively one way or the other. However, I do think it would be good if there was a way to set the function url auth type, and a way to add to the cloudfront distribution behaviors setting "Origin request lambda" that are generated within the construct. That includes things like the headers policy. See this feature request over at SST for some logic as to why. The TLDR is that since there is no other way to set behavior-level things once a behavior is instantiated in CDK, all behavior level settings should be bubbled up in any construct that creates them.

In this scenario, instead of header policies (which I think needs to be supported as well) it would be assigning additional functions to the behaviors that are created. This just increases optionality to more easily support specific use cases but leaves the default path as-is.

bestickley commented 1 year ago

@derekmurawsky, I agree with this:

I do think it would be good if there was a way to set the function url auth type, and a way to add to the cloudfront distribution behaviors setting "Origin request lambda" that are generated within the construct

I could add top level props to enable customization of this, but I think that solution would be temporary. Eventually there will be a request for additional customization.

Working on something now that I think will accommodate this. Main idea is that you'll be able to extend this construct and override methods to get functionality you want without having to worry about how all the other pieces of the construct are organized. For example, you could do:

import { Construct } from "constructs";
import { Nextjs, NextjsProps, NextjsServerOriginProps, NextJsLambda, NextjsDistribution } from "cdk-nextjs-standalone";
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as lambda from 'aws-cdk-lib/aws-lambda';

class MyNextjs extends Nextjs {
  constructor(scope: Construct, id: string, props: NextjsProps) {
    super(scope, id, props);
  }

  createNextjsServerOrigin(props: NextjsServerOriginProps): origins.HttpOrigin {
    const server = new NextJsLambda(...);
    // create function url that requires iam auth and cloudfront http origin
    // return http origin that you create and will be consumed by CloudFront distribution
  }

  createDistribution(props: NextjsDistributionProps): NextjsDistribution {
    return new MyNextjsDistribution(this, "Distribution", props);
  }
}

class MyNextjsDistribution extends NextjsDistribution {
  constructor(scope: Construct, id: string, props: NextjsDistributionProps) {
    super(scope, id, props);
  }
  getLambdasAtEdge(): lambda.EdgeLambda {
    // create lambda@edge that signs url with IAM. This would also have to include the fix for host header
  }
}

@revmischa and others, thoughts? I know this is a sizable change, but I think it will add a lot of customization possibilities and value to future users who want further customization. I'd ensure no breaking changes to top level construct props as they would stay the same, just new non-private methods (so they can be overriden) would be added to Nextjs and NextjsDistribution. Although, I would need to change what NextjsDistribution accepts from functions to cloudfront http origins. This will enable more flexibility for future users to use compute other than lambda functions.

revmischa commented 1 year ago

Your approach of subclassing functions could work, it's not a bad idea. My original idea was more like letting people simply compose their own "top-level" Nextjs construct out of the various blocks they want to use. But it is nice to not have to create that thing from scratch to customize just one or two pieces of it and it probably exposes too much implementation detail.

So I'm generally cool with your idea here.

bestickley commented 1 year ago

Thinking more about the subclassing approach, the downside to it is it adds a lot more to the "public API" meaning that making updates to this construct in the future is more likely going to result in a breaking change which isn't good. However, for max customization this may be required. I'll open up another issue which focuses on allowing different types of compute which is really the "user story" for using the subclass approach.

I've created an PR to address the potential security issue (I say potential because I know some don't see it as a security issue) which implements @derekmurawsky's creative idea to use the Lambda@Edge Origin Request function to sign the request to the Lambda Function URL. Thanks to everyone for their thoughts! Feedback on PR welcome :)