aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS] [Volumes]: Persistent volumes in EBS #64

Closed nunofernandes closed 10 months ago

nunofernandes commented 5 years ago

Target:

Allow the possibility to allocate an existing EBS volume to a container in AWS ECS Optimized AMI without third party components.

Automatic EBS migration of the volume when the container starts on another AZ would be a nice feature.

talawahtech commented 5 years ago

It seems you guys are working on a CSI driver for EBS[1]. If ECS added support for CSI as well that would be awesome.

The other options that I am aware of are the rexray and cloudstor docker volume plugins, but both of those have issues with the latest generation nitro instances. Future development of those plugins also seems uncertain.

  1. https://github.com/kubernetes-sigs/aws-ebs-csi-driver

Akramio commented 5 years ago

Thanks everyone for this request. It would be really awesome if you could give us a little more detail about your need for this feature. For example, which workloads or applications that require EBS would you want to deploy on ECS? How would you imagine this working on ECS in an ideal scenario?

nunofernandes commented 5 years ago

Some workloads do require filesystem persistence (for example WordPress, Django sites, some web apps). EFS could clearly be a solution for that, but EFS is not yet available in all regions and EBS is.

Supporting this on ECS would allow a broader set of applications to be installed on ECS containers, as persistence could be achieved using EBS volumes.

talawahtech commented 5 years ago

My two main use cases are:

1) Applications where block-level storage is strongly recommended, e.g. Postgres, MongoDB, 3rd party apps

2) Being able to use EBS's snapshot feature to create a consistent backup of the filesystem, as well as being able to initialize the filesystem from an existing snapshot

recursivefunk commented 5 years ago

We're currently running workloads which need to perform local filesystem data calculations. Since we can't mount EBS/EFS, we're forced to explicitly download all the data from S3 to the filesystem at boot time. The amount of data we have to download increases linearly with the scale our customers require. This is causing our boot time to increase significantly. By switching to EBS, we would "skip" the data download step and take advantage of lazy data loading from S3 and perhaps do some upfront cache warming.

FernandoMiguel commented 5 years ago

@recursivefunk that one can be easily fixed by having a storage Docker container that you mount a volume from in the other containers
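One way to read this suggestion is sketched below as a partial ECS task definition built in Python: a data container mounts a shared volume, and the application container picks it up through `volumesFrom`. The family, container names, image names, and path are hypothetical. The `local` driver with `shared` scope keeps the volume on a single container instance only, so this is an assumption-laden sketch for the EC2 launch type, and the data still has to be loaded into the volume once per instance.

```python
import json


def build_task_definition(data_image, app_image, data_path="/data"):
    """Build a partial ECS task definition (EC2 launch type) as a plain dict."""
    return {
        "family": "shared-data-example",  # hypothetical family name
        "volumes": [
            {
                "name": "shared-data",
                # A Docker volume that outlives a single task on this instance.
                "dockerVolumeConfiguration": {
                    "scope": "shared",
                    "autoprovision": True,
                    "driver": "local",
                },
            }
        ],
        "containerDefinitions": [
            {
                # Hypothetical container that fills the volume (e.g. from S3).
                "name": "data-store",
                "image": data_image,
                "essential": False,
                "mountPoints": [
                    {"sourceVolume": "shared-data", "containerPath": data_path}
                ],
            },
            {
                # The application sees the same mounts, read-only.
                "name": "app",
                "image": app_image,
                "essential": True,
                "volumesFrom": [
                    {"sourceContainer": "data-store", "readOnly": True}
                ],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_task_definition("example/data", "example/app"), indent=2))
```

The resulting dict is only the volume-related slice of a task definition; fields such as memory and CPU would still need to be added before registering it.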

recursivefunk commented 5 years ago

@FernandoMiguel interesting. If I understand correctly, there would still need to be a download step to load the data in a volume, but it would only happen once, and other ECS tasks could point to it. Not quite the instant S3 <-> Filesystem sync, but certainly a huge improvement. I didn't see anything about caching, either. I don't want to derail this thread 😅, happy to continue this convo elsewhere if you're up to it!

FernandoMiguel commented 5 years ago

@recursivefunk ping me on one of the many slacks about aws (i'm in most) as Fernando (case sensitive :) ) i'll show you examples.

lotjuh commented 5 years ago

We need this as well. Our use case: we're running Hybris in Docker containers and want to speed up the startup time of the dev and test environments. During the build process, we want to spin up a temporary stack which will run the Hybris URS and run the SOLR index jobs so all data is available for the current build. We will then take a snapshot of both the SOLR and MySQL EBS disks and tag them with the build number.

When we start up the dev or test environment, we want to be able to provide these snapshots based on the required build number, which will be used to create a persistent EBS volume for the MySQL container and the Solr container. This should drastically speed up the startup times of our environments. We used to run SOLR with an EFS-backed mount for the cores, but unfortunately we ran into issues with the maximum file locks. We have too many cores in SOLR to be able to host it on EFS.

Akramio commented 5 years ago

Thanks everyone for your feedback and use-cases.

@nunofernandes thanks for your use-case. Can you provide a little more detail on why you would like to use "an existing EBS volume" (vs having EBS dynamically create a new volume and attach it, potentially based on a snapshot)? For example, is this because you would like to migrate a pre-existing workload from EC2?

@recursivefunk, understood. In this case, the EBS volume lifecycle is tied to that specific Task: it is created specifically for that Task, and deleted when the Task dies.

@lotjuh Am I right in saying that this sounds like a read-only use-case, where like in @recursivefunk's use-case, the EBS volume lifecycle is tied to a specific Task (it is created specifically for that Task, and deleted when the Task dies.)? Do you currently deploy all your builds with the same Task Definition? Or do you create a separate Task Definition for each build number?

Akramio commented 5 years ago

I have created issue #127 to collect feedback for stateful services. Please provide +1s and use-cases in there if you believe your workload will require each task to have a unique identifier.

nunofernandes commented 5 years ago

@Akramio it can be an EBS dynamically created volume. I don't need it to be an existing one.

lotjuh commented 5 years ago

@Akramio This is not a read-only use-case. When starting the ECS task, it will need a persistent volume to store the data. Both Solr and MySQL will need to be able to write to it as well. It will also need to persist if the task dies so it can be attached again to the new task that comes in its place and no data will be lost. We do use separate task definitions per build, which need to point to separate snapshots to be used for the persistent EBS volume.

So what we need is the option to define a snapshot ID as part of the ECS service/task which will then be used to create a persistent EBS volume which will be mounted to our docker.

I hope this clarifies it.
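As a rough sketch of what this request looks like in practice, the Python below builds one entry for the `volumeConfigurations` parameter that ECS accepts when starting a task with a managed EBS volume. The field names reflect my understanding of that API and should be checked against the ECS documentation; the volume name, snapshot ID, and role ARN are hypothetical placeholders, and the mapping from build number to snapshot is an illustrative helper, not part of any AWS API.

```python
def snapshot_for_build(build_number, snapshots_by_build):
    """Look up the snapshot ID tagged for a build (hypothetical helper)."""
    try:
        return snapshots_by_build[build_number]
    except KeyError:
        raise LookupError(f"no snapshot recorded for build {build_number}")


def ebs_volume_configuration(volume_name, snapshot_id, role_arn,
                             size_gib=None, volume_type="gp3",
                             filesystem_type="ext4"):
    """Return one volumeConfigurations entry that restores from a snapshot."""
    managed = {
        "snapshotId": snapshot_id,
        "roleArn": role_arn,  # role ECS assumes to manage the volume
        "volumeType": volume_type,
        "filesystemType": filesystem_type,  # should match the snapshot
    }
    if size_gib is not None:
        managed["sizeInGiB"] = size_gib
    # volume_name must match a task definition volume configured at launch.
    return {"name": volume_name, "managedEBSVolume": managed}


if __name__ == "__main__":
    snapshots = {"1042": "snap-0123456789abcdef0"}  # hypothetical data
    config = ebs_volume_configuration(
        "solr-data",
        snapshot_for_build("1042", snapshots),
        "arn:aws:iam::111122223333:role/ecsInfrastructureRole",  # hypothetical
        size_gib=100,
    )
    print(config)
```

A dict like this would be passed in a list as `volumeConfigurations` to a run-task or create-service call. It covers the "create a volume from a snapshot ID" half of the request, not reattaching the same volume to a replacement task.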

Akramio commented 5 years ago

@recursivefunk Just for me to understand better: In your scenario above are you assuming that ECS is using a pre-provisioned EBS volume that you have already created for each Task? Or that ECS is creating an EBS volume 'on the fly' when the Task is launched? If the latter, the download time is replaced with the time it takes to create and attach an EBS volume based on a snapshot.

recursivefunk commented 5 years ago

@Akramio Ideally, the ECS tasks would use a pre-provisioned EBS volume. We'd introduce a data processing step which creates the volume and subsequently-launched tasks would "discover" it. Hope that makes sense!

Akramio commented 5 years ago

@lotjuh and @nunofernandes would you be interested in running some of these stateful workloads you mentioned (Solr, MySQL, CMS) on Fargate?

On Fargate, you would likely not be able to run privileged containers or tune kernel-level parameters.

nunofernandes commented 5 years ago

@Akramio I would love to also have that feature on Fargate, but unfortunately Fargate is not yet available on the eu-west-3 region (my main region). For now I would be happy with regular ECS support.

lotjuh commented 5 years ago

Having it available on fargate is not the main priority for us either. But if it is possible on fargate as well we'll most certainly use it.

jtatum commented 5 years ago

EBS (and EFS) on Fargate are definitely priorities in our organization. We use both via EC2 instances today to handle containerized versions of various COTS apps such as Gitlab and Nexus. The automation needed to support this is clumsy and we're moving all of our stateless containers to Fargate - definitely looking forward to the day when we can do that for stateful containers as well.

dreamingbinary commented 4 years ago

Bump for 2019!

ztane commented 4 years ago

And bump for 2020 :(

tstibbs commented 4 years ago

Another use-case I don't see mentioned here is using fargate for isolated builds.

We use Fargate tasks to run CI builds each time a developer pushes to a branch. This gives us a consistent environment every time we run a build, and provides great isolation between builds (because the container is destroyed at the end of the build), without having to worry about a fleet of EC2s running the containers. However, some of our larger integration tests need more than 20 GB of disk space. EFS does not provide the bandwidth required for IO-intensive processes like Spark, so the only option I can see is to attach an EBS volume to the task in order to provide fast-enough and large-enough storage. This storage could die along with the task though; there is no need for persistence.

JohnPreston commented 3 years ago

Another use-case that I have come across recently is for Fargate too (although it also applies to workloads deployed on EC2 nodes): Kafka-related workloads which use RocksDB and need RocksDB to be able to flush from RAM to disk.

Given that EKS (non-Fargate) gets a whole driver for this, I just cannot understand how ECS is not getting as much love in getting similar features.

Yes, ECS and EKS aren't meant to be the same, otherwise why have the two, but it is just odd to see such drive for one and not the other, especially for the one where you (AWS) do not rely on a community to come up with patches, fixes and such.

Massive bump request for 2021!!!

afretwell commented 2 years ago

I would love to see this for my projects, this would simplify a lot of my deployments and avoid a lot of custom logic.

vibhav-ag commented 10 months ago

This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/

Dudssource commented 10 months ago

> This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/

That's awesome! Do we have, by any chance, the estimated date that this feature will be released to more regions (like sa-east-1)? Thanks!

jtatum commented 10 months ago

Is this feature request actually resolved? Since the volumes aren't, you know, persistent.

From the docs:

> Volumes that are attached to tasks that are managed by a service are not preserved and are always deleted upon task termination.

I'd expect a persistent volume to outlive the duration of a task.
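The distinction in the quoted docs can be sketched in Python. As far as I understand the API, a standalone task accepts a `terminationPolicy` with `deleteOnTermination` in its managed EBS volume configuration, while the service-level configuration has no such field. Treat the field names as assumptions to verify against the ECS documentation; the function name, the `launch_kind` values, and the error message are hypothetical.

```python
def managed_volume_for(launch_kind, volume_name, role_arn, size_gib,
                       keep_after_task=False):
    """Build a volumeConfigurations entry for a standalone task or a service."""
    if launch_kind not in ("standalone", "service"):
        raise ValueError(f"unknown launch kind: {launch_kind!r}")

    managed = {"roleArn": role_arn, "sizeInGiB": size_gib}

    if launch_kind == "standalone":
        # Standalone tasks can opt out of deletion when the task stops.
        managed["terminationPolicy"] = {
            "deleteOnTermination": not keep_after_task
        }
    elif keep_after_task:
        # Service-managed tasks always delete their volume, per the docs.
        raise ValueError(
            "volumes attached to service-managed tasks are always deleted"
        )

    return {"name": volume_name, "managedEBSVolume": managed}


if __name__ == "__main__":
    print(managed_volume_for("standalone", "data", "arn:aws:iam::111122223333:role/example",
                             20, keep_after_task=True))
```

Even with deletion disabled, the preserved volume is not automatically reattached to a later task, which is the gap this comment points at.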

gunzy83 commented 10 months ago

> Is this feature request actually resolved? Since the volumes aren't, you know, persistent.
>
> From the docs:
>
> > Volumes that are attached to tasks that are managed by a service are not preserved and are always deleted upon task termination.
>
> I'd expect a persistent volume to outlive the duration of a task.

Wow, this is utterly ridiculous. This issue is definitely not resolved.

vibhav-ag commented 10 months ago

@gunzy83 @jtatum We understand this release doesn't address all the use cases called out on this issue. We're using this issue to track use cases that require an EBS volume to be reattached to tasks managed by a service when the task is terminated or redeployed. Please add your use case to the issue if you haven't already.

vibhav-ag commented 10 months ago

> > This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/
>
> That's awesome! Do we have, by any chance, the estimated date that this feature will be released to more regions (like sa-east-1)? Thanks!

This will be coming out soon in additional regions.

vibhav-ag commented 10 months ago

@Dudssource this is now available in most commercial AWS regions (including sa-east-1). See documentation for details: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ebs-volumes.html

jtatum commented 10 months ago

> @gunzy83 @jtatum We understand this release doesn't address all the use cases called out on this issue. We're using this issue to track use cases that require an EBS volume to be reattached to tasks managed by a service when the task is terminated or redeployed. Please add your use case to the issue if you haven't already.

It's a little more than use cases called out. The issue summary says "persistent volumes". The volumes delivered are explicitly not persistent.

Dudssource commented 10 months ago

> @Dudssource this is now available in most commercial AWS regions (including sa-east-1). See documentation for details: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ebs-volumes.html

That's awesome @vibhav-ag ! One of our current use cases requires low latency and high throughput IO, this feature will certainly help us a lot. Thanks again!

rajivpatki commented 9 months ago

The current setup does not allow for existing EBS multi-attach volumes to be attached to Fargate tasks on ECS. Using EFS is an option but EFS is extremely expensive when your tasks perform repeated read operations.

Are there any plans on integrating Multi-Attach volumes with ECS tasks?

AlliotTech commented 7 months ago

> The current setup does not allow for existing EBS multi-attach volumes to be attached to Fargate tasks on ECS. Using EFS is an option but EFS is extremely expensive when your tasks perform repeated read operations.

Up until now, Fargate still does not allow the reuse of existing EBS volumes.