Support Container Device Interfaces for Create Container API

containers / podman

Podman: A tool for managing OCI containers and pods.

https://podman.io

Apache License 2.0


Support Container Device Interfaces for Create Container API #23560

Open suptejas opened 1 month ago

suptejas commented 1 month ago

Feature request description

To use Nvidia's CDI and attach a GPU to a container, I can successfully run the command below:

podman run -d --name jupyter-gpu \
  --device nvidia.com/gpu=all \
  --security-opt=label=disable \
  -p 8888:8888 \
  quay.io/dimension/jupyter-server

However, there does not seem to be a clear way to achieve the same through the Podman API. This is critical for companies who would like to orchestrate GPU-backed Podman instances.

Suggest potential solution

Have the devices field in the Podman API correctly parse nvidia.com/gpu=all and CDI device names from other providers.
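
For illustration, a CDI device name has the form vendor/class=name, which is what separates it from a device path such as /dev/fuse. The Python sketch below shows that distinction under simplifying assumptions: the helper parse_cdi_name is hypothetical, it is not Podman's parser, and the pattern only checks the overall shape rather than every rule in the CDI specification.

```python
import re

# Simplified shape of a CDI qualified name: vendor/class=name.
# Assumption: each part is a non-empty run without "/", "=" or whitespace.
# The CDI specification is stricter about the allowed characters.
_CDI_NAME = re.compile(
    r"^(?P<vendor>[^/=\s]+)/(?P<cls>[^/=\s]+)=(?P<name>[^/=\s]+)$"
)


def parse_cdi_name(device):
    """Return (vendor, class, name) if device looks like a CDI name, else None."""
    match = _CDI_NAME.match(device)
    if match is None:
        return None
    return match.group("vendor"), match.group("cls"), match.group("name")


if __name__ == "__main__":
    for candidate in ("nvidia.com/gpu=all", "/dev/fuse"):
        print(candidate, "->", parse_cdi_name(candidate))
```

A device string that fails this check would be handled as an ordinary path, which is consistent with a "stat ...: no such file or directory" style of failure when a CDI name is not recognized as one.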

Have you considered any alternatives?

A clear and concise description of any alternative solutions or features you've considered.

Additional context

I have shared more information in my GitHub discussion below:

rhatdan commented 1 month ago

What version of Podman are you using?

podman info

suptejas commented 1 month ago

I'm using podman version 5.1.2

rhatdan commented 1 month ago

What happens with the remote API? It is supposed to work. What error are you seeing?

suptejas commented 1 month ago

I usually get the error below:

container create: stat nvidia.com/gpu=all: no such file or directory

Could you give me an example of how to specify devices in the remote API for creating a new container? That way, I can test if it works as expected for me. Perhaps it's a problem with how I'm using it? More details in the below discussion: https://github.com/containers/podman/discussions/23520

suptejas commented 1 month ago

That is, what I'm essentially requesting is the equivalent API call for:

podman run -d --name jupyter-gpu \
  --device nvidia.com/gpu=all \
  --security-opt=label=disable \
  -p 8888:8888 \
  quay.io/dimension/jupyter-server
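
As a starting point for testing, the sketch below builds a JSON body that mirrors the command above. Everything about its shape is an assumption that this thread has not confirmed: that the body would be sent as a POST to the libpod containers/create endpoint, that each CDI name goes into a devices entry as its path, and that the field names selinux_opts and portmappings correspond to --security-opt=label=disable and -p. The helper build_create_body is hypothetical, and the code only constructs and prints the payload without contacting a Podman service.

```python
import json


def build_create_body(image, name, cdi_devices, host_port, container_port):
    """Build a candidate create-container body (all field names are assumptions)."""
    return {
        "image": image,
        "name": name,
        # Assumption: a CDI name is passed in the same slot as a device path.
        "devices": [{"path": device} for device in cdi_devices],
        # Assumption: counterpart of --security-opt=label=disable.
        "selinux_opts": ["disable"],
        # Assumption: counterpart of -p 8888:8888.
        "portmappings": [
            {"host_port": host_port, "container_port": container_port}
        ],
    }


if __name__ == "__main__":
    body = build_create_body(
        image="quay.io/dimension/jupyter-server",
        name="jupyter-gpu",
        cdi_devices=["nvidia.com/gpu=all"],
        host_port=8888,
        container_port=8888,
    )
    print(json.dumps(body, indent=2))
```

The field names should be checked against the API reference for the Podman version in use before relying on this body.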

cdrage commented 4 weeks ago

@suptejas @rhatdan I ran into the same problem while doing preliminary research on how to include this in Podman Desktop. The goal is to provide a one-click solution and use the API to enable GPU support.

Going through the internal code, are we just attaching the GPU volumes via container create?

rhatdan commented 4 weeks ago

Just curious, but does --gpus=all work? Or is this being done on the client side as well?

rhatdan commented 4 weeks ago

I think I have to have a local GPU to test this.

$ podman run --device nvidia.com/gpu=all alpine echo hi
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
$ podman --remote run --device nvidia.com/gpu=all alpine echo hi
Error: preparing container 927c637ab1eb12c542122d77a3a3676bc3ce97b47b05c339e2ab134fae76526c for attach: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all

$ podman run --gpus=all alpine echo hi
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
$ podman --remote run --gpus=all alpine echo hi
Error: preparing container a45404350f8c03b6fd7800e1b4b78ad9356a564267178dee15fe024ca6b76245 for attach: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all