AliyunContainerService / gpushare-scheduler-extender

GPU Sharing Scheduler for Kubernetes Cluster
Apache License 2.0
1.36k stars 303 forks

Plugin installed on k8s, but the cluster's GPU resources are not detected #226

Open ferris-cx opened 1 month ago

ferris-cx commented 1 month ago

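The plugin is installed on k8s, and its pods were scheduled successfully and are in the Running state. However, running kubectl inspect gpushare reports 0 GPU resources, and scheduling an inference service fails with: 0/2 nodes are available: 2 Insufficient aliyun.com/gpu-mem. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod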

yashiang1986 commented 1 month ago

You need to check the gpushare-schd-extender log to get the specific error message.