Closed by Michaelvll 2 weeks ago
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
This issue was closed because it has been stalled for 10 days with no activity.
We are also interested in using SkyPilot on AMD GPUs.
What changes are necessary to support this?
Hi @deke997 - what kind of cluster do you have? Does it run any orchestration layer, such as k8s?
We have a PoC PR for AMD on Kubernetes clusters here - https://github.com/skypilot-org/skypilot/pull/3209
Let us know what you think!
We should consider adding support for AMD GPUs, which have been shown to be efficient for ML workloads.
References:
https://www.amd.com/en/technologies/deep-machine-learning
https://www.lamini.ai/blog/lamini-amd-paving-the-road-to-gpu-rich-enterprise-llms
https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference
https://www.mosaicml.com/blog/amd-mi250