containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

podman should use graphRoot/tmp for temporary image storage #11107

Closed akostadinov closed 3 years ago

akostadinov commented 3 years ago

/kind bug

Description

I mounted a new volume at graphRoot, but pulling images still eats up storage under /var/tmp. I think it makes much more sense to use $graphRoot/tmp for temporary storage instead. That directory is already created automatically. I see no reason why a user would expect storage outside graphRoot to be used.

Steps to reproduce the issue:

  1. Mount a volume at /home/fedora/.local/share/containers/storage
  2. podman run large_image
  3. df -h

Describe the results you received:

While pulling, I see the space on / exhausted and the pull fails.

Error: Error writing blob: error storing blob to file "/var/tmp/storage280617948/6": write /var/tmp/storage280617948/6: no space left on device

Describe the results you expected:

/dev/vda1       4.9G  4.3G  0.4G  27% /
/dev/vdb1        20G  5.1G  15G   1% /home/fedora/.local/share/containers/storage

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

host:
  arch: amd64
  buildahVersion: 1.21.3
  cgroupControllers: []
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.29-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: '
  cpus: 4
  distribution:
    distribution: fedora
    version: "34"
  eventLogger: journald
  hostname: cloudimg
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.11.12-300.fc34.x86_64
  linkmode: dynamic
  memFree: 4385394688
  memTotal: 8188215296
  ociRuntime:
    name: crun
    package: crun-0.20.1-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc34.x86_64
    version: |-
      slirp4netns version 1.1.8+dev
      commit: 6dc0186e020232ae1a6fcc1f7afbc3ea02fd3876
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 0
  swapTotal: 0
  uptime: 20m 24.3s
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/fedora/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.fc34.x86_64
      Version: |-
        fusermount3 version: 3.10.4
        fuse-overlayfs: version 1.5
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /home/fedora/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /home/fedora/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.3
  Built: 1626467612
  BuiltTime: Fri Jul 16 20:33:32 2021
  GitCommit: ""
  GoVersion: go1.16.5
  OsArch: linux/amd64
  Version: 3.2.3

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.3
  cgroupControllers: []
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.29-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: '
  cpus: 4
  distribution:
    distribution: fedora
    version: "34"
  eventLogger: journald
  hostname: cloudimg
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.11.12-300.fc34.x86_64
  linkmode: dynamic
  memFree: 7531401216
  memTotal: 8188215296
  ociRuntime:
    name: crun
    package: crun-0.20.1-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc34.x86_64
    version: |-
      slirp4netns version 1.1.8+dev
      commit: 6dc0186e020232ae1a6fcc1f7afbc3ea02fd3876
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 0
  swapTotal: 0
  uptime: 23m 1.07s
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/fedora/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.fc34.x86_64
      Version: |-
        fusermount3 version: 3.10.4
        fuse-overlayfs: version 1.5
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /home/fedora/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /home/fedora/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.3
  Built: 1626467612
  BuiltTime: Fri Jul 16 20:33:32 2021
  GitCommit: ""
  GoVersion: go1.16.5
  OsArch: linux/amd64
  Version: 3.2.3

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.2.3-1.fc34.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

No

Additional environment details (AWS, VirtualBox, physical, etc.):

KVM, VirtualMachineManager

rhatdan commented 3 years ago

If you point the TMPDIR environment variable at that location, does everything work OK?
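
A minimal sketch of that workaround, assuming temporary files should go to a tmp directory under graphRoot. The helper name graphroot_tmp is hypothetical, and the --format template in the usage comment assumes podman info exposes the path as .Store.GraphRoot.

```shell
# graphroot_tmp GRAPH_ROOT
# Hypothetical helper: makes sure GRAPH_ROOT/tmp exists and prints its path.
graphroot_tmp() {
    mkdir -p "$1/tmp" && printf '%s\n' "$1/tmp"
}

# Usage (not executed here); TMPDIR is set for this one command only:
#   graphroot="$(podman info --format '{{.Store.GraphRoot}}')"
#   TMPDIR="$(graphroot_tmp "$graphroot")" podman pull large_image
```

Setting TMPDIR on the command line rather than exporting it keeps the change scoped to a single pull.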

rhatdan commented 3 years ago

One benefit of keeping temporary files in /var/tmp is that we get guaranteed cleanup from systemd at some point. Pointing at our own tmp means we need to pay attention to it. (Hint: we don't.)
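
To illustrate what paying attention to a private temporary directory could involve, here is a rough sketch that removes stale top-level entries by age. It is not something Podman does: clean_stale_tmp is a hypothetical name, the 30-day default is an arbitrary illustrative threshold, and the sketch assumes a find that supports -mindepth and -maxdepth (GNU and BSD do) and that no pull has been writing to an entry for that long.

```shell
# clean_stale_tmp TMP_DIR [MAX_AGE_DAYS]
# Hypothetical helper, not part of Podman.
clean_stale_tmp() {
    tmp_dir=$1
    max_age_days=${2:-30}
    # Nothing to do if the directory does not exist.
    [ -d "$tmp_dir" ] || return 0
    # Delete only direct children last modified more than MAX_AGE_DAYS days ago.
    find "$tmp_dir" -mindepth 1 -maxdepth 1 -mtime +"$max_age_days" \
        -exec rm -rf -- {} +
}
```

Restricting the search to direct children means each leftover pull directory is judged by its own modification time and removed as a unit.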

akostadinov commented 3 years ago

TMPDIR works. But by default you need storage space on the root filesystem as well as on the specially mounted container filesystem. For example, the Oracle image is around 4 GB and does not fit in the standard Fedora qcow2 image. Also, if TMPDIR is used, the user still has to deal with cleanup.

rhatdan commented 3 years ago

@mtrmac @mheon @vrothberg @nalind WDYT?

mtrmac commented 3 years ago

TMPDIR, as well as its default value, is now documented in man pages; so that’s a fairly good reason why users should expect it to be used.

I don’t care much at all what the default should be, but I’m not much of a fan of changing the default now that it exists; this can easily break someone’s carefully-tuned quotas / carefully-tested resiliency of existing workloads vs. pulling unexpectedly-large images.

It might be worth considering if there were other benefits (e.g. maybe integrating this with c/storage’s cleanup of orphaned partial layers — OTOH I’m not sure that would be even possible), or if operational experience strongly suggested that sharing a quota is preferable, but we would have to spend quite some effort to communicate the change (or perhaps even wait for a major version break).

rhatdan commented 3 years ago

One thing we could do is add it to containers.conf, so that we could set up something like:

# Default location for storing temporary container image content. Can be overridden with the TMPDIR environment
# variable. If you specify "storage", then the location of the container/storage tmp directory will be used.
# storage_tmp_dir="storage"
storage_tmp_dir="/var/tmp"

And then allow users to override this field. That way we ensure backwards compatibility.
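
The precedence described in this proposal could be sketched as follows. This is only an illustration, not Podman code: resolve_tmp_dir is a hypothetical helper, storage_tmp_dir is the option name proposed above, and mapping "storage" to graphRoot/tmp is an assumption based on the issue title.

```shell
# resolve_tmp_dir CONFIG_VALUE GRAPH_ROOT
# Hypothetical helper showing the proposed lookup order.
resolve_tmp_dir() {
    config_value=${1:-/var/tmp}   # proposed default when the option is unset
    graph_root=$2
    if [ -n "${TMPDIR:-}" ]; then
        # The environment variable overrides the configuration file.
        printf '%s\n' "$TMPDIR"
    elif [ "$config_value" = "storage" ]; then
        # The special value selects the tmp directory under graphRoot (assumed layout).
        printf '%s\n' "$graph_root/tmp"
    else
        # Any other value is used as a literal path.
        printf '%s\n' "$config_value"
    fi
}
```

With TMPDIR unset, resolve_tmp_dir storage /home/fedora/.local/share/containers/storage would print /home/fedora/.local/share/containers/storage/tmp, while an empty first argument falls back to /var/tmp, so existing setups keep their current behavior.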

akostadinov commented 3 years ago

That would be a good start. The setting should also be referenced from wherever graphRoot is documented. Honestly, I can't find where graphRoot is officially documented.

Funnily, DuckDuckGo only shows me Oracle docs, which do not mention TMPDIR.

And man podman pull does not show anything about graphRoot.

btw I find it a very remote possibility that somebody relies on TMPDIR default being outside graphRoot. If anybody changed it, it would most likely be because they didn't want to burden their root filesystem with container data. If that temporary storage used the regular data storage location, and it didn't leave orphan tmp data, then it's very unlikely IMO that somebody would care.

On the other hand, one loses a lot of time figuring out why some data is here and some is not, where exactly the data goes, and why. (At least it took me a good amount of time, and after 3 months I would probably forget the details and could be bitten by this again.) My observation with Docker is that it keeps everything under one storage directory, unless I'm missing something.

mtrmac commented 3 years ago

> And man podman pull does not show anything about graphRoot.

That’s what I would expect right now, when it does not use graphRoot for temporary files.

> btw I find it a very remote possibility that somebody relies on TMPDIR default being outside graphRoot.

The documentation does talk about TMPDIR though, see https://github.com/containers/podman/blob/main/docs/source/markdown/podman-pull.1.md . And that came from https://github.com/containers/podman/pull/5412 , which points to actual users’ report of needing that setting. So there’s at least one person that relies on that.

akostadinov commented 3 years ago

No wonder users need to set TMPDIR. I'm not suggesting removing that ability.

I'm suggesting a better, less surprising default. At the very least, make this configurable in the configuration file (as Dan suggested) so that the environment variable does not need to be set.