aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

AWS Fargate GPU Support: When is GPU support coming to fargate? #88

Open mbnr85 opened 5 years ago

mbnr85 commented 5 years ago

Tell us about your request What do you want us to build?

Which service(s) is this request for? This could be Fargate, ECS, EKS, ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

Are you currently working around this issue? How are you currently solving this problem?

Additional context Anything else we should know?

Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

FelixRelli commented 2 years ago

Using AWS Batch to run on-demand workloads on GPU instances might be a temporary workaround until Fargate support is there.
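For reference, a minimal sketch of that Batch workaround, assuming a job queue backed by an EC2 (not Fargate) compute environment; the queue and job definition names here are hypothetical placeholders:

```python
def build_gpu_job_request(job_name: str, gpu_count: int = 1) -> dict:
    """Build an AWS Batch submit_job request asking for GPUs.

    The queue and job definition names are placeholders. The queue must be
    backed by an EC2 compute environment, since Batch on Fargate does not
    support GPU resource requirements.
    """
    return {
        "jobName": job_name,
        "jobQueue": "gpu-job-queue",      # hypothetical queue name
        "jobDefinition": "gpu-job-def",   # hypothetical job definition
        "containerOverrides": {
            "resourceRequirements": [
                {"type": "GPU", "value": str(gpu_count)},
            ],
        },
    }


def submit_gpu_job(job_name: str, gpu_count: int = 1) -> str:
    """Submit the job and return its id (requires AWS credentials)."""
    import boto3  # deferred so the request builder stays testable offline
    batch = boto3.client("batch")
    resp = batch.submit_job(**build_gpu_job_request(job_name, gpu_count))
    return resp["jobId"]
```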

hervenivon commented 2 years ago

> Using AWS Batch to run on-demand workloads on GPU instances might be a temporary workaround until Fargate support is there.

Be careful: latency can be high with AWS Batch.

ofirnk commented 2 years ago

Is there a quiet thread to subscribe to, so that we'll be notified when there's an official update?

nselvakkumaran commented 1 year ago

> Using AWS Batch to run on-demand workloads on GPU instances might be a temporary workaround until Fargate support is there.

AWS Batch recommends using EC2 when a GPU is required (i.e. no different from an ECS task plus EC2 machines for capacity). https://docs.aws.amazon.com/batch/latest/userguide/fargate.html

Fargate needs native GPU support to realize the goal of serverless ML inference (most deep learning tasks).

achaiah commented 1 year ago

Really? No GPU support in 2023?

genifycom commented 1 year ago

Crazy, isn't it? We need GPU compute power in a serverless form, and we needed it several years ago!

RogerSanders commented 1 year ago

Adding my +1 here to say this is something I also want to be able to do. When developing rich cloud-based applications, we have a diverse mix of plain CPU and GPU-accelerated compute tasks to execute, which are spun up on-demand by user actions. From our perspective, we just want to be able to launch all these through Fargate, not shunt the tasks that happen to be GPU accelerated through an entirely different pipeline.

miolini commented 1 year ago

I bet it will be released after AGI has been created. :-)

MLaunch commented 1 year ago

We likewise have a similar use case that would benefit from GPU-enabled Fargate and/or GPU-enabled Lambda.

as-polyakov commented 1 year ago

We (DataRobot) are also interested in GPU support for Fargate. Leaving my +1 here

craigjbass commented 1 year ago

Public sector could really benefit from GPU fargate for isolated LLMs

ichenjia commented 1 year ago

Any update on this?

oscarnevarezleal commented 1 year ago

Omg!! Every year or so I fall into the same rabbit hole because I've completely forgotten about this limitation, until I find this entry and it all comes back to me. Still surprising that this request is already 4 years old.

przemekblasiak commented 1 year ago

+1

karkir0003 commented 1 year ago

Any update on this? This would be super convenient!

karkir0003 commented 1 year ago

@PettitWesley, @abby-fuller can you please comment on this issue ASAP!

SuitoYoshiki commented 1 year ago

+1

silvan02 commented 1 year ago

+1

amztc34283 commented 1 year ago

This paper explains the challenges of managing a serverless GPU service and proposes potential solutions: https://arxiv.org/pdf/2303.05601.pdf

ekaj2 commented 1 year ago

The importance of this is higher now than ever before :)

BradVidler commented 1 year ago

+1, very interested in this. Having to use EC2 instances that take 5-10 minutes to start is really unfortunate for our use case. There are competitors that can run instantly, but our entire ecosystem is AWS-based.

sd031 commented 1 year ago

Eagerly waiting for this!

buechera commented 1 year ago

+1. Another team (AWS Panorama) with an interest in Fargate GPU instances for our build/test pipelines.

patradinesh commented 1 year ago

+1

jwstegemann commented 1 year ago

+1

DannySmrt commented 1 year ago

+1

bianchiidentity commented 1 year ago

The current best practice is to launch GPU tasks by running ECS from Lambda, right?

killmepete commented 1 year ago

+1

JacekKosciesza commented 1 year ago

+1 Use case: running panoramic image stitching software

whillas-yabble commented 1 year ago

Just google "serverless GPU". There are lots of cheap third-party providers now that make it very easy to spin up instances and pay for what you use. Generally, AWS is way behind when it comes to AI.

simjak commented 11 months ago

> The current best practice is to launch GPU tasks by running ECS from Lambda, right?

Can you give a reference for this?
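I haven't seen an official reference either, but a rough sketch of that Lambda-to-ECS pattern might look like the following. The cluster and task definition names are hypothetical, and the cluster is assumed to have GPU-backed EC2 capacity registered:

```python
def build_run_task_request(cluster: str, task_def: str) -> dict:
    """Build an ecs.run_task request for a GPU task definition.

    GPU task definitions must use the EC2 launch type, since Fargate
    rejects GPU resource requirements.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_def,
        "launchType": "EC2",
        "count": 1,
    }


def handler(event, context):
    """Hypothetical Lambda entry point: start one GPU task per invocation."""
    import boto3  # deferred so the request builder stays testable offline
    ecs = boto3.client("ecs")
    resp = ecs.run_task(**build_run_task_request("gpu-cluster", "gpu-task"))
    return resp["tasks"][0]["taskArn"]
```

The obvious downside, as noted elsewhere in the thread, is that the EC2 capacity behind the cluster still has to exist or be scaled up, which is where the multi-minute start times come from.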

raheem-imcapsule commented 10 months ago

Any update on this? The issue's 5th anniversary has completed successfully 🥳 🎈

nonedone commented 10 months ago

Adding my nudge here on this historic ticket.

backnight commented 10 months ago

+1

dscain commented 10 months ago

For simple generative AI workloads, Fargate + GPU would be really valuable, even more so now with so many companies working on gen AI features.

oscarnevarezleal commented 10 months ago

The reality is there's not enough GPU capacity available for on-demand scenarios. What's more, the available supply goes directly to the big players.

I just opened some AWS accounts for new clients and they start with a zero EC2 GPU quota. You have to increase it by submitting a quota increase request, and even after that they don't grant all the capacity you ask for; instead, they tell you to wait and see if you really need more.

Looking at the trends over the years, GPUs will need to become either absurdly cheap or widely available (the same thing most of the time) before we can go into on-demand mode and, later, make it IaC-available.

rejochandran commented 8 months ago

1911 days and counting... would love to have this; could be a game changer 🔥

pepitoenpeligro commented 7 months ago

Hi team, this feature could be very interesting for certain types of inference, especially considering the weight of ML and AI in general on AWS's overall path. Thank you so much, team :)

kunal14053 commented 7 months ago

Do we have any updates on this?

genifycom commented 7 months ago

Sadly, Amazon is letting us down on the AI/ML front by not giving us the flexibility we need to advance. As a result, we are falling behind where we should be at this point.

The model seems to be that cloud providers know best and will force us down their path.

We need GPU access in multiple scenarios.

kmulka-bloomberg commented 7 months ago

Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/

FelixRelli commented 7 months ago

> Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/

"Custom models can only be accessed using Provisioned Throughput." => So no pay-per-token. https://aws.amazon.com/bedrock/pricing/

kmulka-bloomberg commented 7 months ago

> "Custom models can only be accessed using Provisioned Throughput." => So no pay-per-token. https://aws.amazon.com/bedrock/pricing/

I'm working on getting clarification from AWS, but this blog post says that custom model import uses the On-Demand mode. https://aws.amazon.com/blogs/aws/import-custom-models-in-amazon-bedrock-preview/
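For anyone evaluating that route, a minimal invocation sketch. The model ARN is a placeholder, and the request body schema is illustrative only; the actual schema depends on the model you import:

```python
import json


def build_invoke_body(prompt: str, max_tokens: int = 256) -> str:
    """Serialize a request body; the schema here is illustrative, not official."""
    return json.dumps({"prompt": prompt, "max_tokens": max_tokens})


def invoke_custom_model(model_arn: str, prompt: str) -> str:
    """Invoke an imported Bedrock model (requires AWS credentials)."""
    import boto3  # deferred so the body builder stays testable offline
    runtime = boto3.client("bedrock-runtime")
    resp = runtime.invoke_model(modelId=model_arn, body=build_invoke_body(prompt))
    return resp["body"].read().decode("utf-8")
```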

tamreddy-dot commented 5 months ago

.

johnwheeler commented 3 months ago

What is gpu_count for on FargateTaskDefinition?

ssignal commented 3 months ago

@johnwheeler I think the error message and the link below will help you: "Resource handler returned message: "Invalid request provided: Create TaskDefinition: Tasks using the Fargate launch type do not support GPU resource requirements.""

https://nocd.hashnode.dev/registering-gpu-instance-w-aws-elastic-container-service-ecs
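To illustrate the constraint behind that error, here is a sketch of a register_task_definition request that works: the GPU requirement is accepted only with EC2 compatibility, and listing "FARGATE" alongside it triggers the error quoted above. The family, image, and memory values are hypothetical:

```python
def build_gpu_task_definition(family: str, image: str, gpu_count: int = 1) -> dict:
    """Build an ecs.register_task_definition request with a GPU requirement.

    requiresCompatibilities must be ["EC2"]; adding "FARGATE" makes ECS
    reject the GPU resource requirement.
    """
    return {
        "family": family,
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            {
                "name": "main",            # hypothetical container name
                "image": image,
                "memory": 4096,            # hypothetical memory reservation (MiB)
                "resourceRequirements": [
                    {"type": "GPU", "value": str(gpu_count)},
                ],
            },
        ],
    }
```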

kendrexs commented 1 month ago

This is WIP for ECS.

kendrexs commented 1 week ago

Updated the comment to clarify that the feature covers GPUs and other advanced capacity.