marcindulak opened this issue 1 year ago
Wow. I was just thinking how needed this is, and I'm happy to see the issue has already been opened. My case: I have AKS configured in Central US and East US 2. We just updated one of our Python calculation pipelines, which uses JAX, to support GPU. We were able to get capacity in East US 2 for NCASv3_T4 (we don't need a ton of memory for our calculations, so the larger sizes are overkill), but Microsoft support says there are no NCASv3_T4 nodes available in Central US.
Now we are faced with a decision: do we set up another difficult-to-maintain AKS cluster in South Central US, where we were able to get NCASv3_T4 capacity, or just let the calculations run slower on CPU nodes in Central US when our East US 2 cluster can't handle all the load?
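For reference, a minimal sketch of how we check which backend a pod actually picked up (the report_devices helper is illustrative, not part of our pipeline); JAX silently falls back to CPU when no GPU is visible, which is exactly the slow path described above:

```python
# Minimal sketch: report whether a GPU backend is available to JAX and
# list the devices a calculation would run on. Works with any recent
# jax + jaxlib install; falls back to CPU when no GPU is visible.
import jax

def report_devices():
    backend = jax.default_backend()  # "gpu", "tpu", or "cpu"
    print(f"JAX backend: {backend}")
    for d in jax.local_devices():
        print(f"  {d.platform}: {d.device_kind}")
    if backend != "gpu":
        print("No GPU visible -- calculations will run on CPU.")

if __name__ == "__main__":
    report_devices()
```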
I was hoping there was a third option: just deploy our containers to Azure Container Apps in South Central US to run on GPU-enabled hardware. Looking forward to how much this would save us in maintenance spend.
Is there anything on the roadmap for Azure Container Apps regarding GPU support? We love the service itself but are forced to switch to Azure Container Instances for now.
I was also interested in GPU support for Container Apps. I recently saw a blog post which says they support dedicated workloads with A100 GPUs, so I think this is supported now, although I haven't tested it directly myself:
"You can now create Azure Container Apps environments with NC A100 v4 GPU enabled compute in the West US 3 and North Europe Azure regions."
It also seems like they are trying to unify the Consumption and Workload modes so that the Workload mode can have a scale-to-zero experience. I will likely try it out soon.
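For anyone else trying it out, here is a minimal sketch, assuming the image ships nvidia-smi and the environment mounts the NVIDIA driver, to confirm a GPU is actually visible inside the container before running the workload:

```python
# Minimal sketch: check whether a GPU (e.g. an A100) is visible inside
# the container by querying the nvidia-smi CLI. Returns an empty list
# when no NVIDIA tooling is present or no GPU profile is attached.
import shutil
import subprocess

def visible_gpus() -> list[str]:
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA tooling in the image / no GPU attached
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in out.stdout.splitlines() if line.strip()]

if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus or "No GPU visible in this container.")
```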
Yes, Microsoft announced preview support for GPUs in Container Apps at Ignite 2023. The cluster sizes start from 24 nodes, only A100 GPUs are supported, and you have to apply to Microsoft if you want to be included in the preview. Hopefully they will open up the preview soon and offer smaller clusters!
Where does general availability of GPU support sit on the roadmap and overall priority list? It seems like it's been in preview in a limited number of regions for quite some time now. Even if it stays in preview, can the number of regions be expanded? Specifically, we are interested in a US East option.
Is your feature request related to a problem? Please describe.
GPU support for container-based workloads: predictions (inference) using small machine learning models, e.g. a few GPUs and a few dozen GB of GPU memory, or fractions of a GPU.
Describe the solution you'd like.
GPU support with auto-scaling, including scaling to zero.
Describe alternatives you've considered.
Additional context.
None