Closed abravalheri closed 3 months ago
GPU Feature Discovery, a DaemonSet deployed by the GPU Operator, labels your GPU nodes with GPU-related information. Describe your GPU nodes and you will see a number of labels with the nvidia.com/ prefix. You can use these labels as node selectors in your pod spec to control where your pod gets scheduled. I would recommend the nvidia.com/gpu.product label for your use case.
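As a sketch of what that looks like, a pod spec along these lines should pin the workload to the T4 nodes. The label value and container image below are illustrative assumptions; check the actual value that GPU Feature Discovery reports on your nodes (e.g. with `kubectl describe node <node-name>`) before using it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: t4-workload
spec:
  # Assumed label value -- confirm the exact string that GPU Feature
  # Discovery set on your nodes via `kubectl describe node <node-name>`.
  nodeSelector:
    nvidia.com/gpu.product: Tesla-T4
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04  # illustrative image
      resources:
        limits:
          nvidia.com/gpu: 1  # still request the GPU resource as usual
```

The nodeSelector restricts scheduling to nodes carrying that label, while the resource limit continues to request the GPU itself.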
Thank you! That seems to work.
Hello, I was trying to follow the documentation at https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-sharing.html#applying-multiple-node-specific-configurations to figure out how to deploy workloads on specific GPUs.
For example, let's assume that I have followed the docs using the following settings:
How do I write my pod specification so that it gets scheduled to use the tesla-t4 instead of the a100-40gb? If I simply write resources: {limits: {nvidia.com/gpu: 1}}, the pod will get scheduled on any of the GPUs that are available, right? How can I specify which one I want to use?

As a wild guess, I tried using a different name for the resource (e.g. nvidia.com/t4-ts4), but it did not seem to work. So I imagine there is a different mechanism for that... Is there any documentation that explains how to achieve this?