NVIDIA / k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes
Apache License 2.0

DRA does not support Tesla P4 GPUs because they do not support setting time slices via nvidia-smi #41

Open wawa0210 opened 8 months ago

wawa0210 commented 8 months ago

When I ran DRA on a Tesla P4 node, I found that the pod failed to start.

Environment

K8s version: v1.27.5
k8s-dra-driver: main branch (latest)

What happened

Deploying a pod that claims one GPU on the Tesla P4 node fails with the following error:

E1221 04:43:21.238356       1 nvlib.go:489]
Failed to set timeslice policy with value Default for GPU 0 : Not Supported
Failed to set timeslice for requested devices : Not Supported
E1221 04:43:21.238522       1 nonblockinggrpcserver.go:127] "dra: handling request failed" err="error preparing devices for claim 5dc94ce6-1e6e-4359-bc73-3d2039797ff0: error setting up sharing: error setting timeslice for 5dc94ce6-1e6e-4359-bc73-3d2039797ff0: error setting time slice: error running nvidia-smi: exit status 3" requestID=2 request="&NodePrepareResourceRequest{Namespace:gpu-test1,ClaimUid:5dc94ce6-1e6e-4359-bc73-3d2039797ff0,ClaimName:pod1-gpu,ResourceHandle:,}"

Digging in, I found that even when the GPU is not configured for sharing, nvidia-smi compute-policy -i <uuid> --set-timeslice 0 is still executed. The Tesla P4 does not support this command, so the error above is reported.

root@nvidia-dcgm-exporter-pvgr8:/# nvidia-smi
Thu Dec 21 04:58:31 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       On  | 00000000:13:00.0 Off |                    0 |
| N/A   28C    P8               6W /  75W |      0MiB /  7680MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@nvidia-dcgm-exporter-pvgr8:/# nvidia-smi -L
GPU 0: Tesla P4 (UUID: GPU-e290caca-2f0c-9582-acab-67a142b61ffa)
root@nvidia-dcgm-exporter-pvgr8:/# nvidia-smi compute-policy -i GPU-e290caca-2f0c-9582-acab-67a142b61ffa --set-timeslice 0
Failed to set timeslice policy with value Default for GPU 0 : Not Supported
Failed to set timeslice for requested devices : Not Supported

code ref https://github.com/NVIDIA/k8s-dra-driver/blob/702a05b98b145bdd7e2beb8543fa24c69e2e7330/cmd/nvidia-dra-plugin/sharing.go#L99-L120
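
For context, here is a minimal Go sketch of that call path, i.e. shelling out to nvidia-smi to set the compute policy. The package name, the setTimeSlice signature, and the error wrapping are illustrative only, not the actual sharing.go code:

package sharing

import (
	"fmt"
	"os/exec"
)

// setTimeSlice shells out to `nvidia-smi compute-policy` for a single GPU.
// On GPUs such as the Tesla P4 the command exits with status 3
// ("Not Supported"), which is what surfaces in the NodePrepareResource
// error shown above.
func setTimeSlice(uuid string, timeslice int) error {
	cmd := exec.Command(
		"nvidia-smi", "compute-policy",
		"-i", uuid,
		"--set-timeslice", fmt.Sprintf("%d", timeslice),
	)
	if output, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("error running nvidia-smi: %w: %s", err, output)
	}
	return nil
}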

Steps to reproduce

Test YAML

cat <<EOF | kubectl apply -f -

apiVersion: v1
kind: Namespace
metadata:
  name: gpu-test1

---
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaimTemplate
metadata:
  namespace: gpu-test1
  name: gpu.nvidia.com
spec:
  spec:
    resourceClassName: gpu.nvidia.com

---
apiVersion: v1
kind: Pod
metadata:
  namespace: gpu-test1
  name: pod1
  labels:
    app: pod
spec:
  containers:
  - name: ctr
    image: chrstnhntschl/gpu_burn
    args: ["3600"]
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    source:
      resourceClaimTemplateName: gpu.nvidia.com
EOF

Other information

NAS info

apiVersion: nas.gpu.resource.nvidia.com/v1alpha1
kind: NodeAllocationState
metadata:
  creationTimestamp: "2023-12-20T11:43:30Z"
  generation: 41
  name: 172-30-43-122
  namespace: nvidia-dra-driver
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: 172-30-43-122
    uid: 0e49e1e1-e0b5-4bfc-a89c-286262f6265d
  resourceVersion: "121350"
  uid: 5b600060-08d4-4525-95d5-02ec746e7c3c
spec:
  allocatableDevices:
  - gpu:
      architecture: Pascal
      brand: Tesla
      cudaComputeCapability: "6.1"
      index: 0
      memoryBytes: 8053063680
      migEnabled: false
      productName: Tesla P4
      uuid: GPU-e290caca-2f0c-9582-acab-67a142b61ffa
  allocatedClaims:
    5dc94ce6-1e6e-4359-bc73-3d2039797ff0:
      claimInfo:
        name: pod1-gpu
        namespace: gpu-test1
        uid: 5dc94ce6-1e6e-4359-bc73-3d2039797ff0
      gpu:
        devices:
        - uuid: GPU-e290caca-2f0c-9582-acab-67a142b61ffa
status: Ready

In this case, if sharing is not set, would it be possible to skip calling the setTimeSlice method?

Looking forward to hearing from the community; then I can try to fix it.
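
A hypothetical sketch of the guard I have in mind, reusing the setTimeSlice helper from the sketch above; the Sharing struct and its fields are assumptions for illustration, not the driver's actual types:

package sharing

// Sharing loosely mirrors a claim's sharing configuration for this sketch.
type Sharing struct {
	TimeSlicingEnabled bool
	TimeSliceInterval  int
}

// applySharing only touches the compute policy when sharing was actually
// requested, so GPUs without compute-policy support (such as the Tesla P4)
// are never passed to nvidia-smi.
func applySharing(s *Sharing, uuid string) error {
	if s == nil || !s.TimeSlicingEnabled {
		// No sharing requested for this claim: leave the GPU's compute
		// policy untouched instead of resetting the timeslice.
		return nil
	}
	return setTimeSlice(uuid, s.TimeSliceInterval)
}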

wawa0210 commented 7 months ago

@klueska @elezar friendly ping

elezar commented 7 months ago

Hi. Sorry @wawa0210.

We have been focussed on other development for the past couple of weeks.

It may make sense to not trigger the nvidia-smi call if sharing is not set. Would you be willing to create a PR with a proposal for us to review?

klueska commented 7 months ago

It's called every time at the moment to ensure that, when sharing is not set, the timeslice gets reset to the default (in case it had been set to something else previously).

A better check might be to ensure that the architecture is Kepler+ before attempting to make the call.
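
A rough sketch of such a gate, keyed on the architecture names exposed in the NodeAllocationState above. The ordering map and the choice of minimum architecture are assumptions for illustration; which minimum is actually correct is the open question discussed below:

package sharing

// archOrder gives a rough chronological ordering of the architecture names
// that appear in the NodeAllocationState (see the NAS info above). The list
// and the choice of minimum architecture are assumptions for this sketch.
var archOrder = map[string]int{
	"Kepler":  1,
	"Maxwell": 2,
	"Pascal":  3,
	"Volta":   4,
	"Turing":  5,
	"Ampere":  6,
	"Hopper":  7,
}

// supportsTimeSlice reports whether a GPU of the given architecture meets the
// assumed minimum architecture for nvidia-smi compute-policy --set-timeslice.
func supportsTimeSlice(arch, minArch string) bool {
	a, okA := archOrder[arch]
	m, okM := archOrder[minArch]
	return okA && okM && a >= m
}

The driver could then skip the nvidia-smi call (or downgrade it to a warning) for devices where supportsTimeSlice returns false.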

wawa0210 commented 7 months ago

A better check might be to ensure that the architecture is Kepler+ before attempting to make the call.

It seems that no accurate documentation describes which architectures support time-slice settings. Is there accurate information available for reference?

wawa0210 commented 7 months ago

Would you be willing to create a PR with a proposal for us to review?

OK, I will open a PR.