containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman doesn't update device major/minor number on machine reboot #21255

Open psaini79 opened 9 months ago

psaini79 commented 9 months ago

Issue Description

We are testing our application on Podman, and it requires devices to be assigned to the Podman container. We assign disks using --device; Podman records the major and minor numbers and resolves the symlink to the host device, as per the documentation: https://docs.podman.io/en/v2.2.1/markdown/podman-build.1.html.

However, when the machine reboots, the disk's major and minor numbers change, and Podman does not update the major and minor numbers it stored previously. As a result, our application keeps failing after a reboot because the container no longer points at the correct disk.

How to handle this situation?

Steps to reproduce the issue

  1. Create container:
    podman run -d --hostname apptest --device /dev/disk/by-partlabel/podman_disk01:/dev/app_disk1 --volume /mnt:/mnt --workdir /mnt --cap-add=SYS_NICE --cap-add=NET_ADMIN --cap-add=SYS_RESOURCE --name apptest container-registry.oracle.com/os/oraclelinux:8 "/usr/sbin/init"
  2. Check the mapping device /dev/disk/by-partlabel/podman_disk01 on Podman host:
    [root@oem-targets opc]# date;lsblk | grep sdi
    Sun Jan 14 11:52:47 GMT 2024
    sdi                  8:128  0   596G  0 disk
    ├─sdi1               8:129  0  93.1G  0 part
    ├─sdi2               8:130  0  93.1G  0 part
    ├─sdi3               8:131  0  93.1G  0 part
    ├─sdi4               8:132  0  93.1G  0 part
    ├─sdi5               8:133  0  93.1G  0 part
    └─sdi6               8:134  0  93.1G  0 part
    [root@oem-targets opc]#
  3. Check inside the apptest container:
    [root@apptest mnt]# ls -ltr /dev/app_disk1
    brw-rw---- 1 root disk 8, 129 Jan 14 11:52 /dev/app_disk1
    [root@apptest mnt]#
  4. Reboot the worker node:
  5. Check the disks allocation on Podman host:
    
    [root@oem-targets ~]# date;lsblk | grep sdc
    Sun Jan 14 12:01:32 GMT 2024
    sdc                  8:32   0   596G  0 disk
    ├─sdc1               8:33   0  93.1G  0 part
    ├─sdc2               8:34   0  93.1G  0 part
    ├─sdc3               8:35   0  93.1G  0 part
    ├─sdc4               8:36   0  93.1G  0 part
    ├─sdc5               8:37   0  93.1G  0 part
    └─sdc6               8:38   0  93.1G  0 part

  6. Check the symlink target on Podman host:
    [root@oem-targets ~]# date;ls -ltr /dev/disk/by-partlabel/podman_disk01
    Sun Jan 14 12:02:08 GMT 2024
    lrwxrwxrwx 1 root root 10 Jan 14 12:01 /dev/disk/by-partlabel/podman_disk01 -> ../../sdc1
    [root@oem-targets ~]#
  7. Check the mapping on Podman host:
    [root@oem-targets ~]# podman start apptest
    apptest
    [root@oem-targets ~]# podman exec -i -t apptest /bin/bash
    [root@apptest mnt]#
    [root@apptest mnt]#
    [root@apptest mnt]# ls -ltr /dev/app_disk1
    brw-rw---- 1 root disk 8, 129 Jan 14 12:03 /dev/app_disk1
    [root@apptest mnt]#
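The comparison made by hand in these steps could be scripted. The following is a minimal sketch, assuming GNU `stat` is available on the host; the function name `dev_numbers` is hypothetical and not part of Podman. It prints the decimal major:minor pair of whatever device node a path currently resolves to.

```shell
# dev_numbers PATH
# Print the decimal major:minor numbers of the device node at PATH.
# Symlinks such as /dev/disk/by-partlabel/* entries are followed.
# Hypothetical helper for illustration only; assumes GNU stat.
dev_numbers() {
    local hex
    # %t and %T are the major and minor device type in hex; -L follows symlinks.
    hex=$(stat -L -c '%t %T' -- "$1") || return 1
    # Split "MAJOR MINOR" and convert each half from hex to decimal.
    printf '%d:%d\n' "0x${hex% *}" "0x${hex#* }"
}
```

Run on the host as `dev_numbers /dev/disk/by-partlabel/podman_disk01`, this would report the current numbers (8:33 after the reboot in this report), which can then be compared with the `8, 129` that `ls -l /dev/app_disk1` still shows inside the container.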


### Describe the results you received

### Describe the results you expected

Podman should update the device's major and minor numbers after a reboot.

### podman info output

```yaml
[root@oem-targets userdata]# podman info
host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.6-1.module+el8.8.0+21045+adcb6a64.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: 31a72124adb6095b6be85b27e3e481313a1cea96'
  cpuUtilization:
    idlePercent: 99.78
    systemPercent: 0.11
    userPercent: 0.11
  cpus: 24
  distribution:
    distribution: '"ol"'
    variant: server
    version: "8.8"
  eventLogger: file
  hostname: oem-targets
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.0-104.119.4.2.el8uek.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 199399735296
  memTotal: 202051198976
  networkBackend: cni
  ociRuntime:
    name: runc
    package: runc-1.1.4-1.0.1.module+el8.8.0+21119+51f68ed8.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.1.4
      spec: 1.0.2-dev
      go: go1.19.10
      libseccomp: 2.5.2
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_SYS_CHROOT,CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /bin/slirp4netns
    package: slirp4netns-1.2.0-2.module+el8.8.0+21045+adcb6a64.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 8207200256
  swapTotal: 8207200256
  uptime: 0h 16m 35.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - container-registry.oracle.com
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 160955367424
  graphRootUsed: 23958343680
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 70
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1697721251
  BuiltTime: Thu Oct 19 13:14:11 2023
  GitCommit: ""
  GoVersion: go1.19.13
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1
```

Podman in a container

Yes

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

Additional information

mheon commented 9 months ago

Hm. We bake the major/minor number into the container at creation time, so there's no real way to update if it changes after reboot. That doesn't match what Docker does, so our current implementation is definitely incorrect.

As a workaround, you can bind-mount a folder containing the device in question into the container. That will track any changes to the device node.

mheon commented 9 months ago

Solution: we'll have to stop resolving devices on container creation and instead add a field to ContainerConfig and stick them there, then resolve them during OCI spec generation. This will only affect new containers (old ones will still use the baked-in OCI spec devices) but will handle things correctly in the future.
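To illustrate the difference between resolving at creation time and resolving at start time, here is a small shell analogy. It is not Podman's code: the function names `resolve_device` and `device_is_stale` are hypothetical, and it compares resolved paths rather than major/minor numbers so that it can be tried with ordinary symlinks.

```shell
# resolve_device PATH
# Resolve a persistent device path (for example a /dev/disk/by-partlabel
# symlink) to the node it points at right now.
# Hypothetical helper for illustration only.
resolve_device() {
    readlink -f -- "$1"
}

# device_is_stale STORED PERSISTENT
# Succeed (exit 0) when a value resolved earlier, e.g. at container creation,
# no longer matches what PERSISTENT resolves to now.
device_is_stale() {
    local current
    current=$(resolve_device "$2") || return 2
    [ "$1" != "$current" ]
}
```

A value saved from `resolve_device` at creation time goes stale as soon as the symlink is repointed, which is what happened here when the partition moved from sdi1 to sdc1. Calling `resolve_device` again on every start always yields the current node, which is the behavior the proposed change aims for.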

rhatdan commented 9 months ago

SGTM

psaini79 commented 9 months ago

@mheon yes, it makes sense to have a flag for this.

sushmbha commented 7 months ago

Hi, can we get a fix for this issue? I think the problem is that Podman stores device names like /dev/sdc in the container config instead of the persistent device names. These names can change after a reboot, so the container config should store the persistent device names rather than the ephemeral ones, which depend on the order in which the kernel discovers devices.
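As a rough sketch of the distinction drawn here, the helper below lists the persistent names (symlinks) that currently point at a given kernel device name. The function name `persistent_names_for` is hypothetical, and the directories to search are passed in by the caller.

```shell
# persistent_names_for DEVICE DIR...
# Print every symlink in the given directories that resolves to DEVICE.
# Hypothetical helper for illustration only.
persistent_names_for() {
    local target dir link
    target=$(readlink -f -- "$1") || return 1
    shift
    for dir in "$@"; do
        for link in "$dir"/*; do
            # Skip anything that is not a symlink (including an unmatched glob).
            [ -L "$link" ] || continue
            if [ "$(readlink -f -- "$link")" = "$target" ]; then
                printf '%s\n' "$link"
            fi
        done
    done
}
```

For example, `persistent_names_for /dev/sdc1 /dev/disk/by-partlabel /dev/disk/by-uuid` would print the stable names for that partition; in this report one of them would be /dev/disk/by-partlabel/podman_disk01, which keeps pointing at the same partition even though the kernel name changed from sdi1 to sdc1.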

rhatdan commented 7 months ago

Care to open a PR to fix?