aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS] [request]: scheduled fargate task with internet access #321

Open fprochazka opened 5 years ago

fprochazka commented 5 years ago

Tell us about your request
I want to be able to create Fargate Scheduled Tasks with access to the internet.

Which service(s) is this request for?
Fargate, ECS, Scheduled Tasks

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I have several "cron jobs" that I want to start once a week to pull some data from the internet, process it, and store it in our DB/S3.

Not being able to pull the image from our GitLab registry is annoying, but I suppose it can be fixed by using ECR. I'm not happy about it, but I can work with that.

But the major problem is not being able to access the internet from within the container.

I've read the documentation and I understand why this doesn't work. But I don't understand why you created it this way. It is the same problem as with Lambda: you have to have a NAT Gateway, otherwise it doesn't work. But a NAT Gateway is charged by the hour, with an additional enormous cost for traffic. Now factor in that Fargate is at least twice as expensive as EC2, and it's suddenly cheaper to keep doing it manually.

This could be solved by allowing a public IP to be assigned to the scheduled task, the same way it is allowed when launching a task manually.

Screenshot from 2019-06-10 15-28-59

Are you currently working around this issue?
I manually start EC2 instances, then my task; after the task finishes, I terminate the EC2 instances.

I'm also considering using a Lambda function that would start the task with a public IP enabled, and triggering that Lambda with a CloudWatch timer. It's a hassle, because I'll have to do it indirectly instead of just configuring the scheduled task.

FernandoMiguel commented 5 years ago

Everything seems to be working as expected. The security and network perimeter are as the Well-Architected Framework prescribes.

fprochazka commented 5 years ago

@FernandoMiguel that is not very helpful

FernandoMiguel commented 5 years ago

You can use a peered ECR so it doesn't need public internet access to download images from ECR.

fprochazka commented 5 years ago

@FernandoMiguel that solves only half of my problem

FernandoMiguel commented 5 years ago

@fprochazka

But I don't understand why you created it this way.

from: https://aws.amazon.com/solutionspace/networking/

Amazon Web Services (AWS) provides the Networking tools and resources that enable you to securely connect to the cloud and then isolate, control, and distribute your applications across EC2 compute resources and all other relevant services in AWS. Networking Solutions available from AWS Partner Network (APN) partners can help you establish your secure, scalable, cost-effective cloud presence more rapidly. Whether you are planning to migrate to AWS or are looking to expand your established network capabilities on the cloud, there are readily available tools and resources at your disposal to accelerate the realization of your goals.

FernandoMiguel commented 5 years ago

or even https://aws.amazon.com/vpc/

Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. You can use both IPv4 and IPv6 in your VPC for secure and easy access to resources and applications. You can easily customize the network configuration for your Amazon VPC. For example, you can create a public-facing subnet for your web servers that has access to the Internet, and place your backend systems such as databases or application servers in a private-facing subnet with no Internet access. You can leverage multiple layers of security, including security groups and network access control lists, to help control access to Amazon EC2 instances in each subnet.

fprochazka commented 5 years ago

@FernandoMiguel that is not helpful

FernandoMiguel commented 5 years ago

@fprochazka you said you didn't understand how a VPC works. I'm giving you resources to learn more. Is there anything specific you don't understand that I can help with?

fprochazka commented 5 years ago

@FernandoMiguel please read the whole post, not just some parts

I've read the documentation and I understand why this doesn't work.

My main problem is that NAT Gateway is expensive and therefore I don't want to use it. I want to be able to just start the container with public IP and be done with it. But AWS Console does not allow that.

FernandoMiguel commented 5 years ago

I want to be able to just start the container with public IP and be done with it. But AWS Console does not allow that.

all you have to do is attach a EP to your ENI.

FernandoMiguel commented 5 years ago

Or you can run your own NAT egress service too

fprochazka commented 5 years ago

all you have to do is attach a EP to your ENI.

Now this is something I'm interested in! What is EP, and where do I configure it please?

FernandoMiguel commented 5 years ago

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html

FernandoMiguel commented 5 years ago

Is that a private subnet? Try creating a public one and make sure you select the “auto-assign public IPv4” check box.

fprochazka commented 5 years ago

EIP is ~$4/month, that is acceptable.

But if I understand it correctly, the ENI is created after the instance starts and is removed after it terminates. Which means it would be very hard to automate with Fargate and Scheduled Tasks.

What am I missing?

FernandoMiguel commented 5 years ago

An EIP is free while in use; you pay only while you don't have it attached to a running instance.

FernandoMiguel commented 5 years ago

Why don't you put your cluster in a public VPC instead of the private one? Then you set the subnet to assign a public IP, and all your issues go away.

fprochazka commented 5 years ago

@FernandoMiguel I didn't know there was "a public VPC".

Nothing in the VPC service in the AWS Console mentions how to create a public VPC, and there is no mention of a public VPC in the ECS cluster creation flow or when creating a Scheduled Task.

FernandoMiguel commented 5 years ago

I meant to say a public subnet, not a VPC.

fprochazka commented 5 years ago

Hmm, I've just spent some time reading tutorials on public subnets and I've realized I don't want to mess with this, because there is a high chance I'll break my default VPC, and I have zero motivation to create a new VPC and migrate everything into it.

I've decided it's safer for me to trigger the ECS Fargate task using Lambda.

fprochazka commented 5 years ago

Turns out, it's really easy to start the task using AWS Lambda, and 'assignPublicIp': 'ENABLED' is available there, which means I can now create a "pseudo" Scheduled Task on Fargate with a public IP, so internet access is working.
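For readers who want to try the same workaround, here is a minimal sketch of such a Lambda function in Python. It is not the code used in this thread. It assumes boto3's ECS `run_task` call, and the environment variable names (`CLUSTER`, `TASK_DEFINITION`, `SUBNETS`, `SECURITY_GROUPS`) are hypothetical placeholders. The subnets still have to be ones with a route to an Internet Gateway.

```python
import os


def build_run_task_params(cluster, task_definition, subnets, security_groups):
    """Build the keyword arguments for the ECS RunTask API call.

    Only a dictionary is built here, so this part can be checked
    without AWS credentials.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                # The setting mentioned above: the task's network
                # interface is given a public IP address.
                "assignPublicIp": "ENABLED",
            }
        },
    }


def handler(event, context):
    """Lambda entry point, meant to be invoked by a scheduled rule."""
    import boto3  # imported lazily; bundled with the AWS Lambda Python runtime

    # Hypothetical environment variable names, chosen for this sketch.
    params = build_run_task_params(
        cluster=os.environ["CLUSTER"],
        task_definition=os.environ["TASK_DEFINITION"],
        subnets=os.environ["SUBNETS"].split(","),
        security_groups=os.environ["SECURITY_GROUPS"].split(","),
    )
    response = boto3.client("ecs").run_task(**params)
    return [task["taskArn"] for task in response["tasks"]]
```

The Lambda's execution role needs permission to call `ecs:RunTask` and to pass the task's IAM roles (`iam:PassRole`); the schedule itself can be a CloudWatch Events rule targeting the function.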

deleugpn commented 5 years ago

It's very natural for a single VPC to have public and private subnets. There's no problem in having both of them in your default VPC. The important thing is to be conscious about what you put where.

A service in a private subnet will not have internet access and will not be accessible from the internet. This is the most protective setup you can get. Theoretically speaking, even if you run an outdated OS with security vulnerabilities, your service is still unreachable from outside the VPC and has a lower chance of being exploited. Note that I'm not suggesting you or anyone do this; it only illustrates how high a security standard private subnets give you.

A public subnet will expose your service with a public IP address, meaning it has access to the internet and can also be accessed by someone else on the internet. In this scenario, Security Groups (firewall rules) are extremely important to keep your exposure to a minimum.

If you have a database (RDS) with public accessibility turned off, you cannot connect to it from your machine, nor from a container or Lambda on a public subnet. If your RDS has public accessibility turned on, you and your container on a public subnet can access it. Having an RDS without public accessibility means that only services running on a private subnet will be able to connect to it, that is, services inside your VPC that you chose to place in a private subnet. Having an RDS publicly accessible means someone can try to attack your database and exploit vulnerabilities to get into your data much more easily than if your database had no internet connectivity at all.

ultimagriever commented 5 years ago

Here's how networking in AWS works:

VPC

A VPC is like your own private network in the cloud. You assign an IP range to the VPC (from /16 to /28) and every instance (not only EC2) that's provisioned inside this VPC will be allocated an IP address within that range, and communication between these instances by their respective private IP addresses will not go through the internet. Resources of services like EC2, ECS, RDS, ElastiCache, Redshift or any other service that requires an IP address must be provisioned inside a VPC.

In a VPC, we have the following components:

Subnets

Subnets are subsets of the IP range within a VPC. Each subnet corresponds to one Availability Zone within the region the VPC was created in. Each subnet is associated with exactly one route table at a time; if you don't associate one explicitly, the VPC's main route table is used.
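As a small illustration of how subnets carve up a VPC's range, the following sketch uses Python's standard `ipaddress` module to split a /16 range into /20 blocks, one per Availability Zone. The CIDR and the zone names are example values chosen for this sketch, and `plan_subnets` is a hypothetical helper, not an AWS API.

```python
import ipaddress


def plan_subnets(vpc_cidr, availability_zones, new_prefix=20):
    """Assign one subnet block of the VPC range to each Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of /20 blocks
    # zip stops at the shorter input, so unused blocks are simply left over.
    return dict(zip(availability_zones, blocks))


# Example values only: a /16 range and three hypothetical zone names.
plan = plan_subnets("172.31.0.0/16", ["zone-a", "zone-b", "zone-c"])
for zone, block in plan.items():
    print(zone, block, block.num_addresses)
```

Each /20 block holds 4096 addresses, and a /16 range has room for 16 such blocks.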

Route tables

The route tables determine where traffic to certain IP ranges goes. By default, all route tables contain at least one local route pointing to the VPC's IP range. This is where you determine whether resources within a subnet have internet access or not, and whether resources within this subnet are accessible from the internet. This can be accomplished by routing requests to 0.0.0.0/0 through either an Internet Gateway or a NAT Gateway.

Subnets associated with route tables that have a route pointing to an Internet Gateway, which makes the resources within them internet-accessible, are commonly referred to as public subnets. On the other hand, subnets associated with route tables without such a route, or with a route pointing to a NAT Gateway, which renders the resources inaccessible from the internet, are referred to as private subnets.
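The public/private distinction described above depends only on where the 0.0.0.0/0 route points, so it can be expressed as a short check. This is a sketch under assumptions: the route dictionaries mimic the `DestinationCidrBlock`, `GatewayId`, and `NatGatewayId` fields returned by the EC2 DescribeRouteTables API, the IDs are made up, and `classify_subnet` is a hypothetical helper.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def default_route_target(routes):
    """Return the gateway ID that the 0.0.0.0/0 route points to, if any."""
    for route in routes:
        if route.get("DestinationCidrBlock") == DEFAULT_ROUTE:
            return route.get("GatewayId") or route.get("NatGatewayId")
    return None


def classify_subnet(routes):
    """A subnet is 'public' only if its default route targets an Internet Gateway."""
    target = default_route_target(routes) or ""
    return "public" if target.startswith("igw-") else "private"


# Made-up route tables for illustration.
local_only = [{"DestinationCidrBlock": "172.31.0.0/16", "GatewayId": "local"}]
via_igw = local_only + [{"DestinationCidrBlock": DEFAULT_ROUTE, "GatewayId": "igw-0abc"}]
via_nat = local_only + [{"DestinationCidrBlock": DEFAULT_ROUTE, "NatGatewayId": "nat-0def"}]

for name, routes in [("local only", local_only), ("via IGW", via_igw), ("via NAT", via_nat)]:
    print(name, classify_subnet(routes), default_route_target(routes))
```

A subnet with only the local route has no internet access at all, while the NAT-routed subnet has outbound access but still counts as private.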

Internet Gateway

The Internet Gateway provides internet access to the VPC. In order for resources to have internet access and be accessible from the internet, they must have a public IP address and be placed in a subnet whose route table sends traffic destined for 0.0.0.0/0 to an Internet Gateway. There can be at most 1 Internet Gateway per VPC.

NAT Gateway

The NAT Gateway performs network address translation for resources within your VPC. To grant resources internet access without making them accessible from the internet, you must place a NAT Gateway in a subnet with a route to the Internet Gateway, then place the resources in another subnet whose route table sends traffic destined for 0.0.0.0/0 to the NAT Gateway. This is also recommended when you have a fleet of instances or containers that need to access a service that restricts access by IP, since all outgoing traffic is identified by the NAT Gateway's Elastic IP address (public IP).

Elastic IP

When a public IP address is auto-assigned to an instance or a container, the IP can change if the machine is stopped and started again or is replaced after an outage. Elastic IP addresses are public IP addresses that are reserved to your account for allocation to your resources. They cost $0.005 per hour as long as they are not associated with a running instance or container.
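To put the hourly figure in perspective, here is the arithmetic as a short sketch. It uses the $0.005 rate quoted above and assumes a 30-day month; current AWS pricing may differ, so treat the numbers as illustrative.

```python
HOURLY_RATE_USD = 0.005      # rate quoted in the comment above
HOURS_PER_MONTH = 24 * 30    # assumption: a 30-day month


def idle_eip_cost(hours_idle, hourly_rate=HOURLY_RATE_USD):
    """Cost in USD of holding an Elastic IP that is not in use."""
    return round(hours_idle * hourly_rate, 2)


# An address left unattached for a whole month.
print(idle_eip_cost(HOURS_PER_MONTH))
```

Under these assumptions a fully idle address comes to about $3.60 per month, in line with the "~$4/month" estimate earlier in the thread.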

Security Groups

Security groups are like resource-level firewalls. In them, you define what type of incoming traffic can hit your resources, to which port(s) and from where, e.g. allow TCP 80 from 0.0.0.0/0, as well as outgoing traffic from your resources, to which port and to where.

By default, all incoming traffic is denied until you explicitly allow it. You can also remove the default rule that allows all outgoing traffic if you would like to restrict that.
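The deny-by-default behaviour for incoming traffic can be modelled in a few lines. This is only an illustrative sketch, not how AWS evaluates security groups internally: the rule format (`protocol`, `from_port`, `to_port`, `cidr`) and the `inbound_allowed` helper are assumptions made for this example.

```python
import ipaddress


def inbound_allowed(rules, protocol, port, source_ip):
    """Return True only if some rule explicitly allows the traffic."""
    source = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule["protocol"] != protocol:
            continue
        if not rule["from_port"] <= port <= rule["to_port"]:
            continue
        if source in ipaddress.ip_network(rule["cidr"]):
            return True
    # No matching rule: traffic is denied.
    return False


# A group with no inbound rules versus one that allows TCP 80 from anywhere.
no_rules = []
web_rules = [{"protocol": "tcp", "from_port": 80, "to_port": 80, "cidr": "0.0.0.0/0"}]

print(inbound_allowed(no_rules, "tcp", 80, "203.0.113.10"))
print(inbound_allowed(web_rules, "tcp", 80, "203.0.113.10"))
print(inbound_allowed(web_rules, "tcp", 22, "203.0.113.10"))
```

With an empty rule list nothing gets in, which is why a task with a public IP but no inbound rules is still unreachable from the internet.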

How this applies to ECS

Both ECS-EC2 and ECS-Fargate require the designation of a VPC: ECS-EC2 so that it can place the cluster's EC2 instances in the VPC, and ECS-Fargate because it uses the VPC directly for the tasks' networking.

In order for containers to have internet access, they must be placed in a public subnet, as described above, or a private subnet with a route pointing to a NAT Gateway. If the containers must be accessible from the internet (as in a webservice), they must either:

Since, in your case, you want to run scheduled tasks, I would just place the containers in a public subnet and auto-assign a public IP address. If you don't open any ports in the security group, then they will be inaccessible anyway.

Optionally, if you have anything in private subnets that accesses other AWS services, you can take a look at VPC endpoints (gateway and interface endpoints), which allow you to bypass the internet for that traffic and therefore avoid NAT Gateway traffic charges.

fprochazka commented 5 years ago

Thank you! This clears up many things for me! I think I now kind-of understand how this works on a high level.

But honestly, I'm afraid to tinker with the creation of public subnets and their auto-assignment of public IPs. It's a pity that the default VPC doesn't have this configured out of the box in a fresh AWS account.

ultimagriever commented 5 years ago

Actually, each region comes with a default VPC with a /20 public subnet in each Availability Zone. If you're only running scheduled tasks that are bound to finish (i.e. not serving anything), then you shouldn't worry about just auto-assigning a public IP address. Nobody will be able to access anything if you just place the containers in the default security group.

fprochazka commented 5 years ago

I'm not sure which parts are relevant, but this is how the default VPC looks in my account, and none of the subnets are "public" in the sense that the Fargate scheduled task has access to the internet (to send requests, not to expose ports).

Screenshot from 2019-06-13 17-03-02 Screenshot from 2019-06-13 17-08-45 Screenshot from 2019-06-13 17-04-27 Screenshot from 2019-06-13 17-04-17

Lanayx commented 4 years ago

I'm creating a ScheduledFargateTask with CDK that runs twice a day, but I'm paying for the NAT Gateways 24 hours a day, which is very undesirable. Is there a way to create those NAT Gateways right before the task runs and remove them once the task completes?