Open mbnr85 opened 5 years ago
Using AWS Batch to run on demand workloads using GPU instances might be a temporary workaround until fargate support is there.
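A minimal sketch of that workaround, assuming a Batch compute environment backed by GPU instances (e.g. g4dn/p3) and a job queue already exist; the job definition name, queue name, and container image below are hypothetical:

```python
import boto3

batch = boto3.client("batch")

# Register a job definition that asks for one GPU; Batch will place the
# job on a GPU-capable EC2 instance in the compute environment.
batch.register_job_definition(
    jobDefinitionName="gpu-inference",  # hypothetical name
    type="container",
    containerProperties={
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:latest",  # hypothetical
        "resourceRequirements": [
            {"type": "VCPU", "value": "4"},
            {"type": "MEMORY", "value": "16384"},
            {"type": "GPU", "value": "1"},
        ],
    },
)

# Submit an on-demand job; Batch scales the underlying instances for you.
batch.submit_job(
    jobName="inference-run",
    jobQueue="gpu-queue",  # hypothetical queue
    jobDefinition="gpu-inference",
)
```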
Be careful, latency can be high with AWS Batch
Is there some quiet thread to subscribe to, so that we'll get notified when there's an official update?
AWS Batch recommends using EC2 when a GPU is required. (i.e. no different from ECS task plus EC2 machines for capacity). https://docs.aws.amazon.com/batch/latest/userguide/fargate.html
Fargate needs to natively support GPU for realizing the goal of Serverless ML inferencing (most deep learning tasks).
Really? No GPU support in 2023?
Crazy isn't it. We need the GPU compute power in a serverless form. We needed it several years ago!
Adding my +1 here to say this is something I also want to be able to do. When developing rich cloud-based applications, we have a diverse mix of plain CPU and GPU-accelerated compute tasks to execute, which are spun up on-demand by user actions. From our perspective, we just want to be able to launch all these through Fargate, not shunt the tasks that happen to be GPU accelerated through an entirely different pipeline.
I bet it will be released right after AGI is created. :-)
We likewise have a similar use case that would benefit from GPU-enabled Fargate and/or GPU-enabled Lambda.
We (DataRobot) are also interested in GPU support for Fargate. Leaving my +1 here
Public sector could really benefit from GPU fargate for isolated LLMs
Any update on this?
Omg!! Every year or so I fall into the same rabbit hole because I completely forgot about this limitation, until I find this entry and it all comes back to me. Still surprising that this request is already 4 years old.
+1
Any update on this? This would be super convenient!
@PettitWesley, @abby-fuller can you please comment on this issue ASAP!
+1
+1
This paper explains the challenges of managing a serverless GPU service and proposes a potential solution: https://arxiv.org/pdf/2303.05601.pdf
The importance of this is higher now than ever before :)
+1 very interested in this. Having to use EC2 instances that take 5-10 minutes to start is really unfortunate for our use case. There are competitors that can run instantly but our entire ecosystem is AWS based.
Eagerly waiting for this!
+1. Another team (AWS Panorama) with an interest in fargate GPU instances for our build/test pipelines.
+1
+1
+1
The current best practice is to launch GPU instances by running ECS tasks from Lambda, right?
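Roughly, that pattern is a Lambda handler that calls ecs.run_task against EC2 capacity. A minimal sketch, assuming an ECS cluster with GPU-enabled EC2 container instances and an EC2 task definition that declares a GPU requirement (all names are hypothetical):

```python
import boto3

ecs = boto3.client("ecs")

def handler(event, context):
    # Launch a GPU task on EC2 capacity; Fargate would reject a task
    # definition that declares a GPU resource requirement.
    response = ecs.run_task(
        cluster="gpu-cluster",        # hypothetical cluster name
        taskDefinition="gpu-task:1",  # EC2 task definition with a GPU requirement
        launchType="EC2",
        count=1,
    )
    return response["tasks"][0]["taskArn"]
```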
+1
+1 Use case: running panoramic image stitching software
Just google "serverless GPU". There are lots of cheap third-party providers now that make it very easy to spin up instances and pay only for what you use. Generally, AWS is way behind when it comes to AI.
> The current best practice is to launch GPU instances by running ECS tasks from Lambda, right?

Can you give a reference for this?
Any update on this? The issue's 5th anniversary completed successfully 🥳 🎈
Adding my nudge here on this historic ticket.
+1
For simple generative AI workloads, Fargate + GPU would be really valuable, even more so now with so many companies working on gen AI features.
The reality is there isn't enough GPU capacity available for on-demand scenarios, and what supply exists goes directly to the big players.
I just opened some AWS accounts for new clients and they start with a zero EC2 GPU quota; you have to increase it by submitting a quota increase request, and even then they don't grant all the capacity you ask for, telling you instead to wait and see whether you really need more.
Looking at the trends over the years, GPUs will need to become either absurdly cheap or widely available (often the same thing) before we can run them on demand and, later, have that available through IaC.
1911 days and counting... would love to have this; could be a game changer 🔥
Hi team, this feature could be very interesting for certain types of inference, especially considering the weight of ML and AI in general on the overall AWS path. Thank you so much, team :)
Do we have any updates on this?
Sadly, Amazon is letting us down on the AI/ML front by not giving us the flexibility we need to advance; as a result, we are falling behind where we should be at this point.
The model seems to be that cloud providers know best and will force us down their path.
We need GPU access in multiple scenarios.
Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/
> Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/
"Custom models can only be accessed using Provisioned Throughput." => So no pay-per token. https://aws.amazon.com/bedrock/pricing/
"Custom models can only be accessed using Provisioned Throughput." => So no pay-per token. https://aws.amazon.com/bedrock/pricing/
I'm working on getting clarification from AWS, but this blog post says that custom model import uses the On-Demand mode. https://aws.amazon.com/blogs/aws/import-custom-models-in-amazon-bedrock-preview/
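For anyone considering that route, invoking an imported model looks roughly like this; a sketch assuming the Custom Model Import job has already completed, with a hypothetical model ARN (the request body schema depends on the model):

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

# Invoke an imported custom model; the ARN is hypothetical and comes
# from the completed Custom Model Import job.
response = bedrock.invoke_model(
    modelId="arn:aws:bedrock:us-east-1:123456789012:imported-model/abc123",
    body=json.dumps({"prompt": "Hello", "max_tokens": 128}),
)
print(json.loads(response["body"].read()))
```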
What is gpu_count for on FargateTaskDefinition?
@johnwheeler I think the error message and the link below will help you: "Resource handler returned message: "Invalid request provided: Create TaskDefinition: Tasks using the Fargate launch type do not support GPU resource requirements.""
https://nocd.hashnode.dev/registering-gpu-instance-w-aws-elastic-container-service-ecs
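To make the error concrete, here is a sketch of a task definition that does accept a GPU requirement: it has to be registered for the EC2 launch type, since Fargate rejects GPU resource requirements (family, names, and image are hypothetical):

```python
import boto3

ecs = boto3.client("ecs")

# A GPU resource requirement is only valid with requiresCompatibilities
# ["EC2"]; registering the same definition for FARGATE fails with the
# error quoted above.
ecs.register_task_definition(
    family="gpu-task",  # hypothetical family name
    requiresCompatibilities=["EC2"],
    networkMode="awsvpc",
    containerDefinitions=[
        {
            "name": "gpu-container",
            "image": "nvidia/cuda:12.2.0-base-ubuntu22.04",  # hypothetical image
            "memory": 16384,
            "essential": True,
            "resourceRequirements": [{"type": "GPU", "value": "1"}],
        }
    ],
)
```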
This is WIP for ECS (comment updated to clarify that the feature covers GPUs and other advanced capacity).
**Tell us about your request**
What do you want us to build?

**Which service(s) is this request for?**
This could be Fargate, ECS, EKS, ECR.

**Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?**
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

**Are you currently working around this issue?**
How are you currently solving this problem?

**Additional context**
Anything else we should know?

**Attachments**
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)