Closed by mengwanguc 2 years ago
Never mind, it should be aliyun.com/gpu-count.
Hello,
Did you manage to make it work with aliyun.com/gpu-count? From #10 it seems this is not possible. I tried to do the same thing, but I can't make it work. Do you have any advice?
Thank you very much,
Michele
I don't think it's possible with their tool.
Their allocation is based on the memory value only: https://github.com/mengwanguc/gpushare-scheduler-extender/blob/878b6fde505d6e11fd15ef4fdcc67091b6d6323a/pkg/cache/nodeinfo.go#L147
To make it possible, we would have to modify their code for our needs.
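To illustrate what "based on memory value only" could mean in practice, here is a minimal Go sketch. It is not the project's code: the `gpu` type, the `pickGPU` function, and the first-fit policy are all hypothetical and assumed only for illustration. The point is that a request carrying just a memory amount gets placed on a single device, so there is nothing in it that could ask for a slice of every GPU.

```go
package main

import "fmt"

// gpu is a hypothetical, simplified view of one device on a node.
// It is not a type from gpushare-scheduler-extender.
type gpu struct {
	totalMem int // total memory in GB
	usedMem  int // memory already handed out, in GB
}

// pickGPU returns the index of the first GPU with enough free memory
// for a request of memGB, or -1 if no single GPU can hold it.
// First-fit is an assumption made for this sketch.
func pickGPU(gpus []gpu, memGB int) int {
	for i, g := range gpus {
		if g.totalMem-g.usedMem >= memGB {
			return i
		}
	}
	return -1
}

func main() {
	// Two GPUs with 16 GB each, as in the scenario discussed in this thread.
	gpus := []gpu{{totalMem: 16}, {totalMem: 16}}

	// Job A asks for 5 GB, job B asks for 10 GB.
	for _, req := range []int{5, 10} {
		idx := pickGPU(gpus, req)
		if idx >= 0 {
			gpus[idx].usedMem += req
		}
		fmt.Printf("request %d GB -> GPU %d\n", req, idx)
	}
}
```

In this sketch both requests land on GPU 0, because each request is matched against one device at a time. Spreading a job across both GPUs would need an additional per-pod GPU count that the placement logic understands.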
Let's say we have jobs A and B, and suppose we have two GPUs, each with 16GB of memory.
Is it possible to let job A occupy both GPUs, using 5GB of memory on each, and let job B occupy both GPUs, using 10GB of memory on each?
I think in the configuration file we can specify "aliyun.com/gpu-mem: 5". But how do we specify the number of GPUs?
Thanks, Meng