aws / apprunner-roadmap

This is the public roadmap for AWS App Runner.
https://aws.amazon.com/apprunner/

Make AppRunner infrastructure more transparent #109

Open Ickbinet opened 2 years ago

Ickbinet commented 2 years ago

Community Note

Tell us about your request

Make AppRunner infrastructure more transparent.

I tried to create an App Runner service with internet access, connected to the default VPC via a VPC connector. Outbound access is blocked, I GUESS because the tasks are running inside private networks and need a NAT gateway. But...

Describe alternatives you've considered

Make these infrastructure things visible to allow troubleshooting.

spesnova commented 2 years ago

@Ickbinet

Here's the blog explaining the VPC Connector under the hood: Deep Dive on AWS App Runner VPC Networking

spesnova commented 2 years ago

@Ickbinet

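Is my understanding correct?

  - You're creating an internet-facing App Runner service
  - The App Runner service requires internet access
  - The internet access from the App Runner service is blocked
  - You can access the App Runner service itself via the default domain (generated by App Runner)
  - You associated the VPC Connector with public subnets in the default VPC

If yes, I guess the Security Group that you associated with the VPC connector is blocking the App Runner service's access to the internet. In general, the default VPC has three public subnets that can reach the internet directly (via an Internet Gateway), so you shouldn't need a NAT Gateway in this case.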

If you created private subnets in the default VPC and associated the VPC Connector with them, you are probably missing a NAT Gateway in the default VPC, which is what allows AWS resources in private subnets to access the internet. Also, don't forget to configure the Security Group's outbound rule to allow internet access.

I hope this helps.

Ickbinet commented 2 years ago

@spesnova You understood everything correctly. Sorry for the delayed answer.

The VPC Connector is connected to all subnets of the default VPC and to the default SG of the default VPC. The default SG has an outbound rule which allows all traffic.

I read the deep dive doc already and all they say is: "Note that if you have traffic destined for the internet, you must enable the appropriate path via NAT and Internet Gateways in your VPC."

I already have an AWS support case open (for 8 days now...) and it seems they have no idea either :)

spesnova commented 2 years ago

@Ickbinet

Thanks for getting back to me.

If it works for you, let's narrow the issue down to either the App Runner/VPC Connector side or the VPC/SG side:

  1. Run an EC2 instance or ECS task attached to the default SG in the default VPC
  2. Log in to the instance/task
  3. Run a command to check whether the instance/task can access the internet (e.g. curl https://api.github.com/zen)

As you already understand, if you can reach the internet from the instance/task, your VPC/SG settings are correct and the issue is on the App Runner/VPC Connector side. In that case, the root cause could be in App Runner (and the VPC connector feature) itself, but also in your app running on App Runner.

If you can't reach the internet from the instance/task, then at least your VPC/SG settings are not correct.
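Where curl is not available in the instance, task, or container image, the same check can be sketched with the Python standard library. This is only an illustration: the function name `can_reach` and the `opener` parameter (there to make the function testable without a network) are made up here, and the URL is the one from step 3.

```python
import urllib.error
import urllib.request


def can_reach(url="https://api.github.com/zen", timeout=5.0,
              opener=urllib.request.urlopen):
    """Return True if a GET request to `url` gets any HTTP answer in time."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered with an error status, so the network path exists.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: no usable outbound path.
        return False


if __name__ == "__main__":
    print("outbound internet access:", can_reach())
```

Running it once from the EC2 instance and once from inside the App Runner container gives a like-for-like comparison of the two network paths.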

Ickbinet commented 2 years ago

EC2 can access the internet. App Runner cannot.

I found out that the App Runner tasks are always running inside private subnets on Fargate and a NAT is required.

Since a VPC Connector cannot be updated and I cannot create a new one (there is only one SG in the VPC, and creating multiple connectors with the same SG is not possible), I am deadlocked now. I have to delete everything in App Runner and start from scratch. As far as I understand, the default VPC is not suitable for an App Runner app that requires both VPC and internet access, because it already has 3 public subnets which consume the whole CIDR range. You have to delete one and create a private one with a NAT. All of this is only possible if no resource is still using the subnet. All this should be documented better.
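Whether the existing subnets really leave no room for a new private subnet can be checked offline with Python's `ipaddress` module. The sketch below assumes the commonly documented default-VPC layout (172.31.0.0/16 with one /20 default subnet per Availability Zone); these CIDRs are assumptions, not values taken from this thread, so substitute the ones shown in your VPC console.

```python
import ipaddress


def free_blocks(vpc_cidr, used_subnets, new_prefix):
    """List blocks of size /new_prefix inside vpc_cidr that overlap no used subnet."""
    vpc = ipaddress.ip_network(vpc_cidr)
    used = [ipaddress.ip_network(cidr) for cidr in used_subnets]
    return [
        block
        for block in vpc.subnets(new_prefix=new_prefix)
        if not any(block.overlaps(subnet) for subnet in used)
    ]


# Assumed layout: a default VPC with three /20 default subnets.
available = free_blocks(
    "172.31.0.0/16",
    ["172.31.0.0/20", "172.31.16.0/20", "172.31.32.0/20"],
    new_prefix=20,
)
print(len(available), "free /20 blocks, first:", available[0])
```

With these assumed values, 13 of the 16 possible /20 blocks would still be free, starting at 172.31.48.0/20. A VPC laid out differently may of course give a different answer.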

Will try this later...


spesnova commented 2 years ago

@Ickbinet

As a workaround, you can create a new VPC connector from your App Runner service's configuration:

  1. Go to App Runner Console
  2. App Runner services
  3. Configure service -> Edit
  4. Networking
  5. Custom VPC -> Add new (VPC Connector)

Then you can attach your App Runner service to another Security Group in the default VPC or connect it to another VPC. You don't need to delete everything 👍
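The console steps above could presumably also be scripted. The following is a rough sketch, not a tested procedure: it assumes the `apprunner` argument behaves like `boto3.client("apprunner")`, and the helper name `switch_vpc_connector` and all argument values are hypothetical.

```python
def switch_vpc_connector(apprunner, service_arn, name, subnet_ids, security_group_ids):
    """Create a new VPC connector and point the service's egress at it.

    `apprunner` is expected to behave like boto3.client("apprunner").
    """
    created = apprunner.create_vpc_connector(
        VpcConnectorName=name,
        Subnets=list(subnet_ids),
        SecurityGroups=list(security_group_ids),
    )
    connector_arn = created["VpcConnector"]["VpcConnectorArn"]

    # Route the service's outbound traffic through the new connector.
    apprunner.update_service(
        ServiceArn=service_arn,
        NetworkConfiguration={
            "EgressConfiguration": {
                "EgressType": "VPC",
                "VpcConnectorArn": connector_arn,
            }
        },
    )
    return connector_arn
```

Passing the client in as a parameter keeps the sketch free of hard dependencies; in real use it would be called with a boto3 client and the ARN of the existing service.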

hossameldeen commented 2 years ago

> @spesnova You understood everything correctly. Sorry for the delayed answer.
>
> The VPC Connector is connected to all subnets of the default VPC and to the default SG of the default VPC. The default SG has an outbound rule which allows all traffic.
>
> I read the deep dive doc already and all they say is: "Note that if you have traffic destined for the internet, you must enable the appropriate path via NAT and Internet Gateways in your VPC."
>
> I already have an AWS support case open (for 8 days now...) and it seems they have no idea either :)

@Ickbinet I'm at this point. Did support get back to you and/or have you managed to give outbound Internet access to your App Runner service?

hossameldeen commented 2 years ago

@spesnova I have this situation indeed. But the Security Group allows All Outbound traffic to 0.0.0.0/0. Still, my App Runner service has no outbound Internet access. What am I doing wrong?

> Is my understanding correct?
>
>   - You're creating an internet-facing App Runner service
>   - The App Runner service requires internet access
>   - The internet access from the App Runner service is blocked
>   - You can access the App Runner service itself via the default domain (generated by App Runner)
>   - You associated the VPC Connector with public subnets in the default VPC
>
> If yes, I guess the Security Group which you associated with the VPC connector blocks the App Runner service accesses to the internet. In general, the default VPC has three public subnets which are directly reachable to internet(Internet Gateway). So you don't need NAT Gateway in this case.

For reference, here is my configuration:

VPC Connector configuration on App Runner service:

image

Security group outbound rule allowing all traffic:

image

Route table routing outbound traffic to internet gateway:

image

Route table association with all subnets (non-explicit. Default, didn't change):

image

How I know that my service has no outbound Internet access:

I understand you don't have something reproducible you can try yourself. And admittedly, I have tried this code only locally; I didn't try it in EC2 to see if the problem is within AWS or App Runner. But if you could read the configuration above and let me know whether you think there's anything in it that prevents outbound Internet access (or that there isn't), I'd be grateful.

I hope I'm still in the spirit of the original question :)

hossameldeen commented 2 years ago

The answer to my question above: https://stackoverflow.com/a/74253516/6690391

adonig commented 1 year ago

@hossameldeen Did anyone from AWS confirm that the VPC Connector is required to have a NAT Gateway? I spent about three days trying to get my app deployed with CDK, App Runner and RDS, and in the end I was able to track all my issues down to the Docker container not having any internet access even though I had associated the VPC Connector with the public subnets of my VPC.

If the VPC Connector always needs to be connected to private subnets that have NAT Gateways, I can only ask AWS to make that clear in the documentation so people don't waste their time trying to set up a VPC Connector without NAT Gateways. Even better would be to prevent the VPC Connector from being associated with public subnets by raising an appropriate error.

jvisker commented 1 year ago

> @hossameldeen Did anyone from AWS confirm that the VPC Connector is required to have a NAT Gateway? I spent about three days trying to get my app deployed with CDK, App Runner and RDS, and in the end I was able to track all my issues down to the Docker container not having any internet access even though I had associated the VPC Connector with the public subnets of my VPC.
>
> If the VPC Connector always needs to be connected to private subnets that have NAT Gateways, I can only ask AWS to make that clear in the documentation so people don't waste their time trying to set up a VPC Connector without NAT Gateways. Even better would be to prevent the VPC Connector from being associated with public subnets by raising an appropriate error.

https://docs.aws.amazon.com/apprunner/latest/dg/network-vpc.html#network-vpc.considerations-subnet

adonig commented 1 year ago

Hey @jvisker! Thank you for providing me with a link to the official documentation. Very nice of you 👍

To keep it purely constructive, let me explain to you what part of the documentation made me consider that it is possible that VPC connectors in public subnets have internet access through the VPC's internet gateway:

"All networking rules for the VPC apply to the outbound traffic of your application."

According to the networking rules for my VPC the public subnets have internet access through the internet gateway.

What I'm missing in the documentation is a sentence like "Note that associating a VPC connector with the public subnets of a VPC still doesn't provide the App Runner service with access to the internet because the ENIs of the VPC connector by default don't come with public IP addresses."

So maybe someone could add that sentence somewhere in the documentation to make this clearer.
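The rule in that proposed sentence can be written down as a tiny model. It is a deliberate simplification for illustration only: the class name, the field names, and the route labels "igw", "nat" and "none" are hypothetical, and security groups and network ACLs are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressSetup:
    default_route_target: str  # "igw", "nat", or "none" (hypothetical labels)
    eni_has_public_ip: bool    # connector ENIs have none unless an EIP is attached


def has_internet_egress(setup: EgressSetup) -> bool:
    """An IGW route only helps an ENI with a public IP; a NAT route also
    works for private addresses because the NAT device translates them."""
    if setup.default_route_target == "nat":
        return True
    if setup.default_route_target == "igw":
        return setup.eni_has_public_ip
    return False


cases = {
    "public subnet, connector ENI as created": EgressSetup("igw", False),
    "public subnet, EIP attached to the ENI": EgressSetup("igw", True),
    "private subnet behind a NAT gateway": EgressSetup("nat", False),
}
for name, setup in cases.items():
    print(f"{name}: {has_internet_egress(setup)}")
```

The first case is the one discussed in this thread: the subnet's route table points at the internet gateway, yet traffic from the connector's ENI has no public address to leave with.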

The CDK documentation for the VPC connector even includes a usage example in which the VPC connector is associated with the public subnets only:

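import * as ec2 from '@aws-cdk/aws-ec2';
// Import added for completeness; CDK v1 module path assumed, matching the ec2 import above.
import * as apprunner from '@aws-cdk/aws-apprunner';

// A VPC created with the construct's defaults (public and private subnets).
const vpc = new ec2.Vpc(this, 'Vpc', {
  cidr: '10.0.0.0/16',
});

// The connector is attached to the PUBLIC subnets only.
const vpcConnector = new apprunner.VpcConnector(this, 'VpcConnector', {
  vpc,
  vpcSubnets: vpc.selectSubnets({ subnetType: ec2.SubnetType.PUBLIC }),
  vpcConnectorName: 'MyVpcConnector',
});

// The service sends its outbound traffic through the connector.
new apprunner.Service(this, 'Service', {
  source: apprunner.Source.fromEcrPublic({
    imageConfiguration: { port: 8000 },
    imageIdentifier: 'public.ecr.aws/aws-containers/hello-app-runner:latest',
  }),
  vpcConnector,
});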

So of course I considered it logical that associating a VPC connector with the public subnets would provide my App Runner service with internet access. What the example doesn't state explicitly is that it only works because the VPC is implicitly created with three NAT gateways by default, which cost around $112 per month.
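For anyone who wants to redo this arithmetic for their own Region, here is a minimal sketch. It only covers the fixed hourly charge and assumes an average month of 730 hours; the hourly rate is derived from the $112 figure above rather than taken from a price list, so treat it as an implied value, not a quoted price.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def monthly_nat_cost(gateways, hourly_rate_usd, hours=HOURS_PER_MONTH):
    """Fixed monthly charge for `gateways` NAT gateways, excluding data processing."""
    return gateways * hourly_rate_usd * hours


def implied_hourly_rate(monthly_total_usd, gateways, hours=HOURS_PER_MONTH):
    """Per-gateway hourly rate implied by a monthly total."""
    return monthly_total_usd / (gateways * hours)


rate = implied_hourly_rate(112, gateways=3)
print(f"implied rate: ${rate:.4f} per gateway-hour")
print(f"one gateway instead of three: ${monthly_nat_cost(1, rate):.2f} per month")
```

Data processing charges come on top of this and depend on traffic volume, so they are left out here.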

And of course no one wants to spend that much money on something they don't really intend to use. So people like me assume that, if App Runner can provide internet access to an application without a VPC connector and without charging for a NAT gateway (there actually is one, but you don't have to pay for it), the application might also have internet access after a VPC connector is attached to public subnets. That assumption is mostly false.

It is not completely false because according to this Deep Dive into App Runner VPC Networking the VPC connector creates Hyperplane ENIs in the associated subnets and about seven months ago a creative person found a way to provide internet access to an App Runner service with a VPC connector associated with the public subnets without NAT gateways just by associating EIPs to those ENIs.

So in the end I am left with a decision: Should I put each of my App Runner services into its own VPC without NAT gateways by manually (or by CDK) associating public IP addresses to the ENIs created by the VPC connector, or should I create a single VPC having NAT instances (because NAT gateways are too expensive) and put all my App Runner services into that single VPC?

Note that you don't have to answer that question. I just put this here so people like me who spend days trying to set up App Runner with serverless RDS under budget can find an overview of their options.

adonig commented 1 year ago

For anyone interested in how to attach the EIPs to the ENIs of the VPC connector programmatically using the CDK, you can put this for-loop after your App Runner service declaration:

        # Assumes these imports at module level:
        #   from aws_cdk import aws_ec2 as ec2, custom_resources as cr
        # `vpc_connector` and `service` are the constructs declared earlier in the stack.
        for index, subnet_id in enumerate(vpc_connector.subnets):
            # Look up the connector's ENI in this subnet at deploy time.
            hyperplane_eni = cr.AwsCustomResource(self, f"HyperplaneEni{index}",
                on_update=cr.AwsSdkCall(
                    physical_resource_id=cr.PhysicalResourceId.of(vpc_connector.vpc_connector_name),
                    service='EC2',
                    action='describeNetworkInterfaces',
                    parameters={
                        "Filters": [
                            {"Name": "subnet-id", "Values": [subnet_id]},
                            {"Name": "interface-type", "Values": ["fargate"]},
                        ]
                    }
                ),
                policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
                    resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE,
                ),
            )
            # The ENI only exists once the service has been created.
            hyperplane_eni.node.add_dependency(service)
            # Allocate a fresh EIP and associate it with the first ENI returned.
            eip = ec2.CfnEIP(self, f"HyperplaneEni{index}EIP")
            ec2.CfnEIPAssociation(self, f"HyperplaneEni{index}EIPAssociation",
                allocation_id=eip.attr_allocation_id,
                network_interface_id=hyperplane_eni.get_response_field('NetworkInterfaces.0.NetworkInterfaceId'),
            )

It iterates over the VPC connector subnets, uses an AwsCustomResource to look up the hyperplane ENI in each one, and associates a freshly allocated EIP with it.

EDIT: Now I can also confirm that this approach works and I have an App Runner service connected to my serverless RDS database in a private subnet and connected to the internet via a VPC connector in public subnets without using any expensive NAT Gateways or NAT instances 🚀

[3 images: one untitled, plus "Screenshot from 2022-12-19 02-35-30" and "Screenshot from 2022-12-19 02-36-04"]

adonig commented 1 year ago

It turns out there might be a scaling problem: by default only five EIPs are available to your account (per Region), and the VPC connector might create more ENIs than you have EIPs. In that case even a Lambda function that periodically checks for hyperplane ENIs without a public IP can't handle it.
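The limit can be made concrete with a small planning helper of the kind such a periodic check would need. This is a sketch under assumptions: the input is a list of dicts shaped like the `NetworkInterfaces` entries returned by EC2's DescribeNetworkInterfaces, the function names are made up, and the quota of five is the default value mentioned above.

```python
def enis_without_public_ip(network_interfaces):
    """Return the IDs of ENIs that have no public IP association."""
    return [
        eni["NetworkInterfaceId"]
        for eni in network_interfaces
        if "PublicIp" not in eni.get("Association", {})
    ]


def plan_eip_assignments(network_interfaces, eip_quota=5, eips_in_use=0):
    """Split ENIs lacking a public IP into those the remaining EIP quota
    can cover and those it cannot."""
    pending = enis_without_public_ip(network_interfaces)
    remaining = max(eip_quota - eips_in_use, 0)
    return pending[:remaining], pending[remaining:]


# Made-up example: seven connector ENIs, one of which already has an EIP.
sample = [{"NetworkInterfaceId": "eni-0", "Association": {"PublicIp": "203.0.113.10"}}]
sample += [{"NetworkInterfaceId": f"eni-{i}"} for i in range(1, 7)]

covered, uncovered = plan_eip_assignments(sample, eip_quota=5, eips_in_use=1)
print("can get an EIP:", covered)
print("left without internet access:", uncovered)
```

In this made-up example four of the six remaining ENIs can be covered and two are left over, which is exactly the situation the comment describes.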

Maybe someone from AWS can confirm whether the VPC connector might create more than one ENI per subnet. If so, maybe AWS could consider keeping App Runner's own free NAT gateway when the user associates the VPC connector with public or isolated subnets, and routing only the traffic intended for private VPC resources through the hyperplane ENIs. IMO that would make App Runner the perfect container service.

Until then, for people who don't want to spend $96 per month for three NAT Gateways and 50% more for outgoing traffic, Fargate might be the better solution 😔