containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Device requests are not respected via API #22645

Open jennydaman opened 6 months ago

jennydaman commented 6 months ago

Issue Description

"Device requests" are how GPUs are requested through the Docker API. However, Podman does not respect device requests when a container is created over the Podman socket.

Steps to reproduce the issue

Here is a Python script which tests the Docker and Podman socket APIs.

Setup: install Python version 3.10+ and run pip install docker==7.0.0

Run these tests:

import subprocess as sp
import docker
import docker.types

# Setup: create unix socket clients
# --------------------------------------------------------------------------------

podman_socket = sp.check_output(['podman', 'info', '--format', '{{ .Host.RemoteSocket.Path }}'], text=True).strip()
podman_client = docker.DockerClient(base_url=f'unix://{podman_socket}')
docker_client = docker.DockerClient(base_url='unix:///var/run/docker.sock')

# Sanity checks: assert podman is working
# --------------------------------------------------------------------------------

assert b'!... Hello Podman World ...!' in podman_client.containers.run('quay.io/podman/hello', auto_remove=True)

# Sanity checks: assert podman and docker both work with nvidia-container-toolkit
# --------------------------------------------------------------------------------

def test_nvidia_smi_works_using_command(command: str):
    assert sp.check_output([command, 'run', '--rm', '--gpus=all', 'registry.access.redhat.com/ubi9:9.4-947.1714667021', 'nvidia-smi', '-L']).startswith(b'GPU 0')

test_nvidia_smi_works_using_command('docker')
test_nvidia_smi_works_using_command('podman')

# Bug reproduction cases
# --------------------------------------------------------------------------------

GPU_REQUEST = {
    'device_requests': [ docker.types.DeviceRequest(count=1, capabilities=[['gpu']]) ]
}

def test_nvidia_smi_works_using_client(client: docker.DockerClient):
    assert client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], **GPU_REQUEST).startswith(b'GPU 0')

test_nvidia_smi_works_using_client(docker_client)  # pass
test_nvidia_smi_works_using_client(podman_client)  # fail

def test_device_request_goes_through(client: docker.DockerClient):
    container = client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], detach=True, **GPU_REQUEST)
    device_requests = container.attrs['HostConfig'].get('DeviceRequests') or []
    assert len(device_requests) > 0
    # Capabilities is a list of lists (e.g. [['gpu']]), so test for membership
    assert any(['gpu'] in (request.get('Capabilities') or []) for request in device_requests)

test_device_request_goes_through(docker_client)  # pass
test_device_request_goes_through(podman_client)  # fail
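
The inspection check in the last test can also be written as a standalone helper that tolerates a missing or null `DeviceRequests` field. This is only a minimal sketch: `has_gpu_request` is a hypothetical name, and it assumes the inspect output keeps the same shape as the request, where `Capabilities` is a list of lists such as `[['gpu']]`.

```python
from typing import Any, Mapping


def has_gpu_request(host_config: Mapping[str, Any]) -> bool:
    """Return True if any device request in HostConfig asks for the 'gpu' capability.

    Hypothetical helper: assumes 'DeviceRequests' is either absent, null,
    or a list of dicts whose 'Capabilities' is a list of lists of strings.
    """
    # A missing or null field is treated the same as an empty list.
    requests = host_config.get('DeviceRequests') or []
    for request in requests:
        capability_sets = request.get('Capabilities') or []
        # Each inner list is one set of capabilities, e.g. ['gpu'].
        if any('gpu' in capability_set for capability_set in capability_sets):
            return True
    return False
```

With a helper like this, the test body would reduce to `assert has_gpu_request(container.attrs['HostConfig'])`, and a failure would come from the assertion itself instead of a `TypeError` when the field is null.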

Describe the results you received

Describe the results you expected

podman info output

host:
  arch: amd64
  buildahVersion: 1.35.3
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.11-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: e21e7c85b7637e622f21c57675bf1154fc8b1866'
  cpuUtilization:
    idlePercent: 94.1
    systemPercent: 1.54
    userPercent: 4.36
  cpus: 20
  databaseBackend: boltdb
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  freeLocks: 2012
  hostname: geo
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.8.9-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 97578004480
  memTotal: 134802944000
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: /usr/lib/podman/aardvark-dns is owned by aardvark-dns 1.10.0-2
      path: /usr/lib/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: /usr/lib/podman/netavark is owned by netavark 1.10.3-1
    path: /usr/lib/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.15-1
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: /usr/bin/pasta is owned by passt 2024_04_26.d03c4e2-1
    version: |
      pasta 2024_04_26.d03c4e2
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.3.0-1
    version: |-
      slirp4netns version 1.3.0
      commit: 8a4d4391842f00b9c940bb8f067964427eb0c964
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 1h 31m 25.00s (Approximately 0.04 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/jenni/.config/containers/storage.conf
  containerStore:
    number: 20
    paused: 0
    running: 14
    stopped: 6
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jenni/.local/share/containers/storage
  graphRootAllocated: 1578640605184
  graphRootUsed: 1019202039808
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 56
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/jenni/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.2
  Built: 1713438799
  BuiltTime: Thu Apr 18 07:13:19 2024
  GitCommit: 3304dd95b8978a8346b96b7d43134990609b3b29-dirty
  GoVersion: go1.22.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

$ nvidia-container-cli info
NVRM version:   550.78
CUDA version:   12.4

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce RTX 3080 Ti
Brand:          GeForce
GPU UUID:       GPU-c61acb21-8716-6540-271c-39beab917d03
Bus Location:   00000000:01:00.0
Architecture:   8.6

Additional information

No response

mheon commented 6 months ago

Any chance you can get us the JSON being sent by the container create request (the first podman_client.containers.run)? I don't have an NVIDIA card, so I can't use CDI to try to reproduce this.

jennydaman commented 6 months ago

Sure.

The Python code

client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], device_requests=[ docker.types.DeviceRequest(count=1, capabilities=[['gpu']]) ])

sends this JSON to the socket:

{
  "Hostname": null,
  "Domainname": null,
  "ExposedPorts": null,
  "User": null,
  "Tty": false,
  "OpenStdin": false,
  "StdinOnce": false,
  "AttachStdin": false,
  "AttachStdout": true,
  "AttachStderr": true,
  "Env": null,
  "Cmd": [
    "nvidia-smi",
    "-L"
  ],
  "Image": "registry.access.redhat.com/ubi9:9.4-947.1714667021",
  "Volumes": null,
  "NetworkDisabled": false,
  "Entrypoint": null,
  "WorkingDir": null,
  "HostConfig": {
    "NetworkMode": "default",
    "DeviceRequests": [
      {
        "Driver": "",
        "Count": 1,
        "DeviceIDs": [],
        "Capabilities": [
          [
            "gpu"
          ]
        ],
        "Options": {}
      }
    ]
  },
  "NetworkingConfig": null,
  "MacAddress": null,
  "Labels": null,
  "StopSignal": null,
  "Healthcheck": null,
  "StopTimeout": null,
  "Runtime": null
}
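
For reproducing without docker-py, the request can be reduced to the fields that matter and posted directly. The following is a rough sketch using only the Python standard library. The names `build_create_payload`, `UnixHTTPConnection`, and `create_container` are hypothetical, and the unversioned `/containers/create` path is an assumption about what the socket accepts; a versioned prefix may be needed.

```python
import http.client
import json
import socket
from typing import Any


def build_create_payload(image: str, cmd: list[str], gpu_count: int = 1) -> dict[str, Any]:
    """Build a minimal container-create body carrying one GPU device request."""
    return {
        'Image': image,
        'Cmd': cmd,
        'HostConfig': {
            'DeviceRequests': [
                {
                    'Driver': '',
                    'Count': gpu_count,
                    'DeviceIDs': [],
                    'Capabilities': [['gpu']],
                    'Options': {},
                }
            ]
        },
    }


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        super().__init__('localhost', timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def create_container(socket_path: str, payload: dict[str, Any]) -> tuple[int, str]:
    """POST the payload to the socket; returns (HTTP status, response body).

    The endpoint path is an assumption, not confirmed by this thread.
    """
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(
            'POST',
            '/containers/create',
            body=json.dumps(payload),
            headers={'Content-Type': 'application/json'},
        )
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

A call such as `create_container('/run/user/1000/podman/podman.sock', build_create_payload('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L']))` would send the same `DeviceRequests` entry as above. The image probably has to be pulled beforehand, since this sketch does not pull it.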

rrbanda commented 6 months ago

@jennydaman does the following help? https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-podman It basically comes down to using nvidia-container-toolkit.

jennydaman commented 6 months ago

@rrbanda that is unrelated to this issue.

Podman implements the Docker API in an attempt to be compatible with Docker. This issue is about the DeviceRequests field of Docker's API, which is different from CDI (Container Device Interface).

seb-835 commented 5 months ago

Hi, thanks @jennydaman for this thread. I got the same behaviour when trying to manage Podman through the Docker API, and I do not see any Podman code that handles the DeviceRequests field from the Docker API.

@mheon is there something we can do to help?