containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

macos/9pfs: funky recursive readdir() results on mounted volumes #21097

Closed lauri-paypay closed 7 months ago

lauri-paypay commented 10 months ago

Issue Description

Several commands hang when recursing into directory structures mounted from macOS into podman containers. The problem does not occur in the container filesystem; only within mounted volumes.

It appears to be related to opendir()/readdir() returning strange results. It can be triggered with busybox find(1) on alpine (musl), for example; see below.

I'm not at all sure if the problem is in podman, the CoreOS kernel, musl in alpine, or what, but thus far I've only been able to repro it with podman on macos, so I'm reporting it here.

Steps to reproduce the issue

  1. podman machine start && cd $HOME && mkdir foo foo/bar foo/bar/baz
  2. podman run --rm -it -v $HOME/foo:/tmp alpine:latest find /tmp

The problem does not occur if foo/bar/baz does not exist, or is not a directory; it appears two levels of directories are required to trigger it.

Describe the results you received

/tmp
/tmp/bar
/tmp/bar/baz
/tmp/bar
/tmp/bar/baz
/tmp/bar
/tmp/bar/baz
/tmp/bar
/tmp/bar/baz
/tmp/bar
[... continues until killed]

Describe the results you expected

/tmp
/tmp/bar
/tmp/bar/baz

podman info output

host:
  arch: arm64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.8-2.fc39.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 93.69
    systemPercent: 2.48
    userPercent: 3.83
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: coreos
    version: "39"
  eventLogger: journald
  freeLocks: 2048
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.6.3-200.fc39.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 541609984
  memTotal: 1979428864
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.8.0-1.fc39.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.8.0
    package: netavark-1.8.0-2.fc39.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.8.0
  ociRuntime:
    name: crun
    package: crun-1.12-1.fc39.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.12
      commit: ce429cb2e277d001c2179df1ac66a470f00802ae
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231119.g4f1709d-1.fc39.aarch64
    version: |
      pasta 0^20231119.g4f1709d-1.fc39.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 4m 38.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 2947833856
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.7.2
  Built: 1698762633
  BuiltTime: Tue Oct 31 23:30:33 2023
  GitCommit: ""
  GoVersion: go1.21.1
  Os: linux
  OsArch: linux/arm64
  Version: 4.7.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

I'm not sure how to upgrade podman from 4.7.2 in the coreos image; 4.8.2 does not appear to be available.

Additional information

The following program demonstrates the strange results from readdir(), modelled after the busybox find(1) implementation:

#define _GNU_SOURCE /* for asprintf() */
#include <sys/stat.h>
#include <err.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *root;

/* Print fn; if it is a directory, descend into each of its entries. */
void
recurse(char *fn)
{
    struct stat sb;
    if (lstat(fn, &sb) == -1)
        err(1, "lstat %s", fn);
    if (!S_ISDIR(sb.st_mode)) {
        printf("%s\n", fn);
        return;
    }
    printf("%s/\n", fn);
    DIR *d = opendir(fn);
    if (!d)
        err(1, "opendir %s", fn);
    struct dirent *next;
    /* The directory stream stays open while we recurse, as in busybox find. */
    while ((next = readdir(d)) != NULL) {
        char *nextfn;
        /* Trace what readdir() returns for the top-level directory only. */
        if (strcmp(fn, root) == 0) {
            printf("readdir(%s) next: %s\n", root, next->d_name);
        }
        if (strcmp(next->d_name, ".") == 0)
            continue;
        if (strcmp(next->d_name, "..") == 0)
            continue;
        if (asprintf(&nextfn, "%s/%s", fn, next->d_name) == -1)
            err(1, "asprintf %s/%s", fn, next->d_name);
        recurse(nextfn);
        free(nextfn);
    }
    closedir(d);
}

int
main(int argc, char **argv)
{
    if (argc != 2)
        errx(1, "usage");
    root = argv[1];
    recurse(root);
}

when run in the alpine:latest container with the mounts from the repro steps, outputs:

/ # ./a.out /tmp
/tmp/
readdir(/tmp) next: .
readdir(/tmp) next: ..
readdir(/tmp) next: bar
/tmp/bar/
/tmp/bar/baz/
readdir(/tmp) next: .
readdir(/tmp) next: ..
readdir(/tmp) next: bar
/tmp/bar/
/tmp/bar/baz/
readdir(/tmp) next: .
readdir(/tmp) next: ..
readdir(/tmp) next: bar
[... ad infinitum]

i.e., it appears that after recursing into /tmp/bar and returning from that recursion, readdir() on /tmp returns its results from the start all over again, even though opendir() was called on /tmp only once, at the beginning.
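
For comparison, here is a rough sketch of a walker that reads every name in a directory into memory and closes the stream before descending, so no directory stream stays open across a recursive call. The names snapshot_dir() and walk_snapshot() are made up for this illustration. That this ordering avoids the restart seen above is only an assumption; it has not been tested against the 9p mount from the repro steps.

```c
#define _GNU_SOURCE /* for asprintf() */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: copy all entry names of `path` (except "." and "..")
 * into a heap array and close the stream before returning.
 * Returns the number of names, or -1 on error. */
static int snapshot_dir(const char *path, char ***out)
{
    assert(path != NULL && out != NULL);
    DIR *d = opendir(path);
    if (!d)
        return -1;
    char **names = NULL;
    int n = 0, cap = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        if (n == cap) {
            int ncap = cap ? cap * 2 : 16;
            char **tmp = realloc(names, (size_t)ncap * sizeof *tmp);
            if (!tmp)
                goto fail;
            names = tmp;
            cap = ncap;
        }
        names[n] = strdup(e->d_name);
        if (!names[n])
            goto fail;
        n++;
    }
    closedir(d);
    *out = names;
    return n;
fail:
    for (int i = 0; i < n; i++)
        free(names[i]);
    free(names);
    closedir(d);
    return -1;
}

/* Hypothetical walker: print `fn`, then descend into the snapshot of its
 * entries. Returns the number of paths printed, or -1 on error. */
static long walk_snapshot(const char *fn, FILE *out)
{
    struct stat sb;
    if (lstat(fn, &sb) == -1)
        return -1;
    fprintf(out, "%s\n", fn);
    if (!S_ISDIR(sb.st_mode))
        return 1;
    char **names = NULL;
    int n = snapshot_dir(fn, &names);
    if (n < 0)
        return -1;
    long total = 1;
    int failed = 0;
    for (int i = 0; i < n; i++) {
        char *child;
        if (asprintf(&child, "%s/%s", fn, names[i]) == -1) {
            failed = 1;
        } else {
            long sub = walk_snapshot(child, out);
            free(child);
            if (sub < 0)
                failed = 1;
            else
                total += sub;
        }
        free(names[i]);
    }
    free(names);
    return failed ? -1 : total;
}
```

Calling walk_snapshot("/tmp", stdout) from a small main() inside the container would be one way to compare its behaviour with the reproducer above. If it terminated where the reproducer loops, that would point at the open directory stream being disturbed during recursion.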

rhatdan commented 10 months ago

The default file system for the QEMU version of Podman machine is 9p (Plan 9). When we switch to using applehv, it will be virtiofs. It would be interesting to see if the problem disappears with the switch.
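
One way to confirm from inside the container which file system backs the mounted volume is to compare the statfs(2) file system type against the 9p magic number. This is a Linux-only sketch. The helper name is_9p_mount() is made up for illustration, and the constant is the V9FS_MAGIC value from the kernel's linux/magic.h, defined locally here as a fallback.

```c
#include <stdio.h>
#include <sys/vfs.h> /* statfs(2), Linux-specific */

/* V9FS_MAGIC as defined in <linux/magic.h>; defined locally so the
 * sketch does not depend on the kernel headers being installed. */
#ifndef V9FS_MAGIC
#define V9FS_MAGIC 0x01021997UL
#endif

/* Hypothetical helper: returns 1 if `path` lives on a 9p mount,
 * 0 if it lives on some other file system, and -1 if statfs() fails. */
static int is_9p_mount(const char *path)
{
    struct statfs sfs;
    if (statfs(path, &sfs) == -1)
        return -1;
    return (unsigned long)sfs.f_type == V9FS_MAGIC ? 1 : 0;
}
```

With the mounts from the repro steps, is_9p_mount("/tmp") would be expected to return 1 on the QEMU-based machine if the volume is indeed served over 9p, but that has not been checked here.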

afbjorklund commented 10 months ago

I'm not at all sure if the problem is in podman, the CoreOS kernel, musl in alpine, or what,

Seems to be specific to virtfs-on-darwin, does not reproduce on Linux.

https://github.com/containers/podman/releases/download/v4.7.2/podman-remote-static-linux_amd64.tar.gz

$ podman-remote-static --connection podman-machine-default run --rm -it -v $HOME/foo:/tmp alpine:latest find /tmp
/tmp
/tmp/bar
/tmp/bar/baz

So probably: none of the above, but most likely in the 9p server (or corner case in 9p client)

lauri-paypay commented 10 months ago

the problem doesn't occur when running my reproducer directly on CoreOS via podman machine ssh though, so maybe it's virtfs-on-darwin but also musl specific, or maybe the container mounts change something.

github-actions[bot] commented 8 months ago

A friendly reminder that this issue had no activity for 30 days.

Luap99 commented 7 months ago

podman 5.0 uses virtiofs with the Apple hypervisor, so I suggest you retry with that. Either way, this does not seem to be a bug in podman.

lauri-paypay commented 6 months ago

podman 5.0 uses virtiofs with the Apple hypervisor, so I suggest you retry with that. Either way, this does not seem to be a bug in podman.

thanks. I can't reproduce the issue with 5.0, so whichever component it was in, this issue can be closed.