Closed lengrongfu closed 2 months ago
If the community thinks this use case is needed, I can contribute.
This is already supported using a custom config file for the device plugin.
The docs are a bit disorganized in that the instructions for how to supply a custom config file for the plugin are buried in the instructions for setting up time-slicing, but the content is still relevant: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/23.9.2/gpu-sharing.html#configuration
Whatever setting you put for the migStrategy will be set on the node when you associate it with that config.
Thanks for the documentation you provided. The following are the steps I took to implement the use case.
Step 1: Create a ConfigMap manifest, migstrategy-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: migstrategy-config  # must match the name patched into the ClusterPolicy in Step 3
data:
  mixed: |-
    version: v1
    flags:
      migStrategy: mixed
  single: |-
    version: v1
    flags:
      migStrategy: single
Step 2: Add the configmap
$ kubectl create -n gpu-operator -f migstrategy-config.yaml
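The node labels applied in Step 5 select a key from this ConfigMap by name, so it is worth checking that the keys exist before labeling anything. A minimal sketch of such a check follows; the function name check_config_keys and the KUBECTL override variable are hypothetical helpers, not part of the GPU Operator.

```shell
# Check that a ConfigMap in the gpu-operator namespace carries every key
# that node labels will refer to.
# Usage: check_config_keys <configmap-name> <key>...
# KUBECTL is a hypothetical override so the function can be dry-run or stubbed.
check_config_keys() {
  name=$1; shift
  for key in "$@"; do
    # jsonpath prints the value stored under data.<key>, or nothing if absent
    body=$("${KUBECTL:-kubectl}" get configmap "$name" -n gpu-operator \
      -o jsonpath="{.data.$key}") || return 1
    if [ -z "$body" ]; then
      echo "missing key: $key" >&2
      return 1
    fi
  done
}
```

Call it with the value of metadata.name from the manifest followed by the keys you plan to use as label values, for example `check_config_keys <configmap-name> single mixed`.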
Step 3: Configure the device plugin with the config map
$ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
    -n gpu-operator --type merge \
    -p '{"spec": {"devicePlugin": {"config": {"name": "migstrategy-config"}}}}'
Step 4: Confirm that the gpu-feature-discovery and nvidia-device-plugin-daemonset pods restart.
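One way to confirm the restart is to wait for both daemonsets to finish rolling out. This is a sketch under assumptions: it assumes the daemonsets are named gpu-feature-discovery and nvidia-device-plugin-daemonset (matching the pod names above), and the function name confirm_restart and the KUBECTL override are hypothetical.

```shell
# Namespace used throughout this thread.
NS=gpu-operator

# Wait for both daemonsets named in Step 4 to finish rolling out.
# Set KUBECTL=echo to preview the commands without touching the cluster.
confirm_restart() {
  for ds in gpu-feature-discovery nvidia-device-plugin-daemonset; do
    "${KUBECTL:-kubectl}" rollout status "daemonset/$ds" -n "$NS" --timeout=120s || return 1
  done
}
```

Running `confirm_restart` returns non-zero if either rollout does not complete within the timeout. Listing recent events with `kubectl get events -n gpu-operator --sort-by='.lastTimestamp'` is another way to see the pods being recreated.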
Step 5: Label each node with the config it should use, one node at a time, choosing either single or mixed:
$ kubectl label node <node-name> nvidia.com/device-plugin.config=single
$ kubectl label node <node-name> nvidia.com/device-plugin.config=mixed
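For more than a handful of nodes, the labeling can be driven from a small mapping. The sketch below only prints the kubectl commands so they can be reviewed first. The node names gpu-node-a and gpu-node-b and the function name mig_label_cmd are hypothetical. The --overwrite flag is added so that a node already labeled with one config can be switched to the other.

```shell
# Print (not run) the label command for one node; reject unknown config names.
mig_label_cmd() {
  node=$1
  strategy=$2
  case "$strategy" in
    single|mixed) ;;
    *) echo "unknown config: $strategy" >&2; return 1 ;;
  esac
  echo "kubectl label node $node nvidia.com/device-plugin.config=$strategy --overwrite"
}

# Hypothetical node-to-config mapping; replace with real node names.
while read -r node strategy; do
  mig_label_cmd "$node" "$strategy"
done <<'EOF'
gpu-node-a single
gpu-node-b mixed
EOF
```

Once the printed commands look right, pipe the output to `sh` to apply them. `kubectl get nodes -L nvidia.com/device-plugin.config` then shows which config each node is associated with.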
Closing as the requested feature is already supported.
I have a use case. We have a GPU cluster. Currently, the entire cluster using MIG can only have single or mixed mode. I want to set some nodes to single mode and some nodes to mixed mode.