Open itsham-sajid opened 6 months ago
Adding a user story here: a team needs to use spot GPU nodes for burst jobs that arrive intermittently. Because there is currently no GPU node image (for now, the GPU driver has to be installed automatically after the node is deployed), startup is slow. Excluding allocation time, it takes 5 minutes for a node to start (and for its containers to start). This is harmful from a FinOps perspective. Based on syslog, the automatic GPU driver installation takes around 1 minute. If that time could be cut, it would help.
Is there any update on this? I would also like to use a custom image.
Describe the solution you'd like
Allow users to use a custom OS image version for AKS cluster nodes, similar to Amazon's custom Amazon Linux AMIs for Amazon EKS.