Project-HAMi / HAMi

Heterogeneous AI Computing Virtualization Middleware
http://project-hami.io/
Apache License 2.0

Gpuless deployments #469

Open · sandwichdoge opened this issue 2 months ago

sandwichdoge commented 2 months ago

Hello, I'd like to create a k8s deployment without GPUs; however, my nvidia.com/gpu config doesn't work:

      limits:
        cpu: "2"
        memory: 4Gi
        nvidia.com/gpu: "0"
      requests:
        cpu: "2"
        memory: 4Gi
        nvidia.com/gpu: "0"

I confirmed that requesting 1 or more GPUs with a certain amount of VRAM works. However, requesting "0" simply exposes all the GPUs on that worker node to the running pod(s):

kubectl -n 0cd68651-0852-4cfa-9ebc-b2c42f02f746 exec -it nogpus-776f988c55-9cq84 nvidia-smi
Defaulted container "nogpus" out of: nogpus, init-chown-data (init)
Thu Sep  5 08:13:48 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     Off |   00000000:00:10.0 Off |                  Off |
|  0%   23C    P8             20W /  300W |       1MiB /  49140MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Here's my full deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    hami.io/gpu-scheduler-policy: ""
    hami.io/node-scheduler-policy: ""
  creationTimestamp: "2024-09-05T08:13:15Z"
  generation: 1
  labels:
    app: nogpus
  name: nogpus
  namespace: 0cd68651-0852-4cfa-9ebc-b2c42f02f746
  resourceVersion: "27240742"
  uid: c1b6d40b-ff10-4b51-b288-5b1d2322dcf4
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: nogpus
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nogpus
    spec:
      containers:
      - env:
        - name: NOTEBOOK_ARGS
          value: --NotebookApp.token='9cd4990a-e599-4a74-8500-e4d42149738b'
        image: registry.fusionflow.cloud/notebook/pytorch-notebook:cuda12-python-3.11-nvdashboard
        imagePullPolicy: IfNotPresent
        name: nogpus
        ports:
        - containerPort: 8888
          protocol: TCP
        resources:
          limits:
            cpu: "2"
            ephemeral-storage: 200Mi
            memory: 4Gi
            nvidia.com/gpu: "0"
          requests:
            cpu: "2"
            ephemeral-storage: 200Mi
            memory: 4Gi
            nvidia.com/gpu: "0"
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/jovyan/work
          name: notebook-data
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: nogpus
      initContainers:
      - command:
        - /bin/chown
        - -R
        - 1000:100
        - /home/jovyan/work
        image: mirror.gcr.io/library/busybox:1.31.1
        imagePullPolicy: IfNotPresent
        name: init-chown-data
        resources: {}
        securityContext:
          privileged: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/jovyan/work
          name: notebook-data
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: notebook-data
        persistentVolumeClaim:
          claimName: notebook-data
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2024-09-05T08:13:27Z"
    lastUpdateTime: "2024-09-05T08:13:27Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-09-05T08:13:15Z"
    lastUpdateTime: "2024-09-05T08:13:27Z"
    message: ReplicaSet "nogpus-776f988c55" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

Any pointers? Thank you.

archlitchi commented 2 months ago

Yes, you should add the env 'NVIDIA_VISIBLE_DEVICES=none' to this container. Please refer to this issue: https://github.com/Project-HAMi/HAMi/issues/464
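For reference, a minimal sketch of what that looks like applied to the deployment above (only the relevant fields shown; the NOTEBOOK_ARGS entry is copied from the original manifest):

spec:
  template:
    spec:
      containers:
      - name: nogpus
        env:
        # Hides all GPU devices from this container; the NVIDIA container
        # runtime reads this variable to decide which devices to expose.
        - name: NVIDIA_VISIBLE_DEVICES
          value: "none"
        - name: NOTEBOOK_ARGS
          value: --NotebookApp.token='9cd4990a-e599-4a74-8500-e4d42149738b'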

sandwichdoge commented 2 months ago

@archlitchi Thanks for the reply. If the pod user overrides this env var, will they still be able to see all the GPUs? I'm working in a low-trust environment where pod users must only be able to use their own allocated VRAM.

I'm aware there's an option to prevent the pod user from overriding env vars:

sudo vi /etc/nvidia-container-runtime/config.toml
# need these lines:
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false

However, enabling these lines causes the pods with allocated GPUs to crash with this error:

NAME                    READY   STATUS             RESTARTS     AGE
gpus-5bcbc4d55b-zkcsz   0/1     CrashLoopBackOff   1 (2s ago)   5s

kubectl -n 09e5313f-659a-499a-9085-e600df6ea705 logs -f gpus-5bcbc4d55b-zkcsz
Defaulted container "gpus" out of: gpus, init-chown-data (init)
tini: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
archlitchi commented 2 months ago

Indeed, if you enable these lines, the device plugin will not work properly, because it needs to set 'NVIDIA_VISIBLE_DEVICES' in order to assign GPUs to pods. Also, a user can bake this env var directly into the image, which these lines cannot detect. The best practice is to add a mutating webhook configuration that injects 'NVIDIA_VISIBLE_DEVICES=none' into each container of every pod.
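A rough sketch of such a webhook registration is below. The names, namespace, and service are hypothetical; the actual mutation is performed by a webhook server you deploy yourself, which responds with a JSON patch adding the variable.

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: nvidia-visible-devices-injector   # hypothetical name
webhooks:
- name: nvidia-visible-devices-injector.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore        # do not block pod creation if the webhook is down
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:
      name: env-injector       # hypothetical webhook server service
      namespace: kube-system
      path: /mutate
    caBundle: <base64-encoded CA bundle>
  # The webhook server would return a JSONPatch such as the following
  # (assuming the container already has an env list):
  #   [{"op": "add",
  #     "path": "/spec/containers/0/env/-",
  #     "value": {"name": "NVIDIA_VISIBLE_DEVICES", "value": "none"}}]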