containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

podman run fails if `--tty` or `-t` is used and `/tmp` is mapped to the podman machine #18230

Open chevdor opened 1 year ago

chevdor commented 1 year ago

Issue Description

After reporting this issue, I tested with a default podman machine. In that case, the issue described below does NOT occur.

I do run into the issue when using a freshly created machine close to the default machine but with an extra mapping to /tmp:

# The first three mounts are the defaults; the /tmp mount is non-default and breaks things.
podman machine init \
  -v /Users:/Users \
  -v /private:/private \
  -v /var/folders:/var/folders \
  -v /tmp:/tmp

Mapping /private/tmp does work fine though.

what works

$ podman run ubuntu echo "Hello"

Resolved "ubuntu" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/ubuntu:latest...
Getting image source signatures
Copying blob sha256:2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6
Copying config sha256:08d22c0ceb150ddeb2237c5fa3129c0183f3cc6f5eeb2e7aa4016da3ad02140a
Writing manifest to image destination
Storing signatures
Hello

what does not work

$ podman run -t ubuntu echo "Hello"

Error: preparing container 873a0d7c4f7137d12b4ec1a3dc51c9021769e444e944bef8b542af3a2050d6da for attach: container create failed (no logs from conmon): conmon bytes "": readObjectStart: expect { or n, but found , error found in #0 byte of ...||..., bigger context ...||...

Steps to reproduce the issue

podman run -t ubuntu echo "Hello"

or

podman run -it ubuntu

Describe the results you received

See description

Describe the results you expected

Using --tty or -t works without error.

podman info output

host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.31
    systemPercent: 0.44
    userPercent: 0.25
  cpus: 4
  distribution:
    distribution: fedora
    variant: coreos
    version: "37"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.2.8-200.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 32253423616
  memTotal: 32849166336
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.3-2.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.3
      commit: 59f2beb7efb0d35611d5818fd0311883676f6f7e
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 11m 38.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 0
    stopped: 3
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 2748747776
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.2
  Built: 1677669779
  BuiltTime: Wed Mar  1 12:22:59 2023
  GitCommit: ""
  GoVersion: go1.19.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.2

Podman in a container

No

Privileged Or Rootless

Tried both, appears not relevant.

Upstream Latest Release

Yes

chevdor commented 1 year ago

I am testing further and the issue does not seem to be related to my funky folders but to:

-v /tmp:/tmp

chevdor commented 1 year ago

I edited the description to remove the mappings that are not relevant.

Luap99 commented 1 year ago

When you trigger this error please run podman machine ssh journalctl -r and then look for errors reported by conmon or podman. This hopefully shows us some useful error message.

chevdor commented 1 year ago

I did not see anything obvious but here is a dump: https://gist.github.com/chevdor/48913984195ec6962719c22765dd1b2f

Luap99 commented 1 year ago

Yeah doesn't show anything useful to me.

@mheon Any idea how --tty could be related to the /tmp mount? From the error it looks like conmon is just segfaulting?

chevdor commented 1 year ago

Could it be that tty needs /tmp in the podman machine for some reason, and putting the user's host /tmp on top of it does not end well?

I am personally using /tmp a lot because it is "self organising" :) and also short to type.

If that's the case, an option would be to mount the machine's /tmp as something else, such as /temp, and free /tmp for the user. I did not check how the machine is defined, but its /tmp is probably using tmpfs, and that may cause issues when binding it to a non-tmpfs directory from the host.

Arguably the user could also use /temp but this is counter-intuitive and the machine will be easier to educate than all the users.

mheon commented 1 year ago

If I had to guess, it would be related to logging - trying a container with --log-driver=none will probably confirm that.

chevdor commented 1 year ago

Let me try. I'll create a new machine, now that I have one that works :)

podman machine init broken -v /Users:/Users -v /private:/private -v /var/folders:/var/folders -v /tmp:/tmp                              

Arf. Hitting another bug, possibly in podman-desktop. The new machine somehow overwrote my working one...

I can reproduce the issue with podman run -t ubuntu echo "Hello". The following fails the same way: podman run -t --log-driver=none ubuntu echo "Hello"

chevdor commented 1 year ago

Could this bring ideas? Here is the /tmp of the machine when it is not bound to the host:

podman machine ssh broken
Connecting to vm broken. To close connection, use `~.` or `exit`
Fedora CoreOS 38.20230414.2.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos

[core@localhost ~]$ cd /tmp
[core@localhost tmp]$ ll
total 0
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-chronyd.service-nBFmu3
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-dbus-broker.service-G4f7xv
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-rpm-ostreed.service-oMp8ZI
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-systemd-hostnamed.service-S8X5v8
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-systemd-logind.service-fAcYet
drwx------. 3 root root 60 Apr 18 20:02 systemd-private-340e57b45f3a4b598317dc471638eb5a-systemd-resolved.service-Qq7bQz

chevdor commented 1 year ago

Now I can confirm that /tmp is a tmpfs as expected:

[core@localhost tmp]$ df
Filesystem      1K-blocks       Used Available Use% Mounted on
devtmpfs             4096          0      4096   0% /dev
tmpfs             1000404         84   1000320   1% /dev/shm
tmpfs              400164       5668    394496   2% /run
/dev/vda4       104266732    2280344 101986388   3% /sysroot
overlay         104266732    2280344 101986388   3% /usr
tmpfs             1000404          0   1000404   0% /tmp                  <----------
/dev/vda3          358271     103884    230631  32% /boot
tmpfs              200080          4    200076   1% /run/user/501
vol0           1953902844 1671874764 282028080  86% /Users
vol1           1953902844 1671874764 282028080  86% /private
vol2           1953902844 1671874764 282028080  86% /var/folders
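Plain df prints the filesystem name rather than its type, so a second check can help tell tmpfs mounts from the shared volumes. Below is a minimal Python sketch, assuming python3 is available wherever it runs and that /proc/mounts exists (true on Linux). The helper name `fs_type_of` is made up for illustration and is not part of podman.

```python
def fs_type_of(mount_point: str, mounts_text: str):
    """Return the filesystem type of the last mount at `mount_point`, or None."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass.
        # Spaces inside paths are escaped there as \040, so split() is safe.
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # later entries shadow earlier ones
    return fs_type


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print(fs_type_of("/tmp", f.read()))
```

Run inside the machine, this prints the type of whatever is currently mounted at /tmp, which should differ between the working machine and the one created with -v /tmp:/tmp.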

chevdor commented 1 year ago

I suggest NOT using podman-desktop for those tests for now due to this issue.

Luap99 commented 1 year ago

If I had to guess, it would be related to logging - trying a container with --log-driver=none will probably confirm that.

As the reporter shows, it only fails with --tty; logging should be the same regardless of whether --tty is set.

Following the code in conmon I found this: https://github.com/containers/conmon/blob/08c34bda8c75a37f153dfbd63399d22050551053/src/conn_sock.c#L170-L191

get_tmp_dir defaults to /tmp, so conmon tries to create a temporary socket under /tmp. https://docs.gtk.org/glib/func.get_tmp_dir.html

Does 9p filesystem mount support sockets? I think the socket call is failing but then I still do not understand why there is no error message from conmon in the journal.
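One way to test the socket question directly is to attempt the same kind of bind inside the machine. The following is a minimal Python sketch, not conmon's code, and it assumes python3 is available where it runs. The helper names `tmp_dir` and `can_bind_unix_socket` and the probe file name are hypothetical. `tmp_dir` is a simplified stand-in for the GLib lookup (TMPDIR, then /tmp).

```python
import errno
import os
import socket


def tmp_dir() -> str:
    """Simplified stand-in for GLib's g_get_tmp_dir(): $TMPDIR, else /tmp."""
    return os.environ.get("TMPDIR") or "/tmp"


def can_bind_unix_socket(directory: str):
    """Try to bind an AF_UNIX stream socket to a fresh path in `directory`.

    Returns (True, None) on success or (False, errno_name) on failure.
    """
    # Hypothetical file name; only the directory matters for the probe.
    path = os.path.join(directory, "probe-%d.sock" % os.getpid())
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
    os.unlink(path)  # bind() created the socket file; clean it up
    return True, None


if __name__ == "__main__":
    print(tmp_dir(), can_bind_unix_socket(tmp_dir()))
```

If the mount under /tmp cannot hold sockets, the second element of the result names the errno that bind returned, which is the piece of information missing from the journal.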

mtrmac commented 1 year ago

Mapping /private/tmp does work fine though.

Are you using -v /private/tmp:/tmp or -v /private/tmp:/private/tmp, then?

The latter probably indicates nothing either way.

If the former works, and the two behave differently, that would suggest that we are mapping just the symlink over 9pfs. And in that case we are creating a socket inside the VM at /private/tmp/…, and I can well imagine permissions, or SELinux, not being happy with that.


Does 9p filesystem mount support sockets?

https://github.com/torvalds/linux/blob/2d1bcbc6cd703e64caf8df314e3669b4786e008a/fs/9p/vfs_inode.c#L54-L55 suggests that it can, depending on options (and server support?).

chevdor commented 1 year ago

Are you using -v /private/tmp:/tmp or -v /private/tmp:/private/tmp, then?

I use -v /private/tmp:/private/tmp.

Your question brings up a nice idea that would solve another one of my problems. I would love to be able to use:

podman run --rm -it -v /tmp/mysite:/var/www/bla nginx

instead of my current:

podman run --rm -it -v /private-tmp/mysite:/var/www/bla nginx

But the problem described in this issue also remains when mapping -v /private/tmp:/tmp. The issue is that the /tmp of the Podman machine is then no longer its own /tmp, and the OS seems to rely on it to work properly. As a result, any mapping onto /tmp, such as -v /whatever/folder:/tmp, will cause trouble.

mtrmac commented 1 year ago

Reproduced.

@Luap99 You were right, the failure is

socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 7
fchmod(7, 0700)                         = 0
bind(7, {sa_family=AF_UNIX, sun_path="/tmp/conmon-term.JAIG51"}, 110) = -1 EOPNOTSUPP (Operation not supported)
write(2, "[conmon:e]: Failed to bind to co"..., 69) = 69

This child process is reporting the error.

But that report is not visible because https://github.com/containers/conmon/blob/08c34bda8c75a37f153dfbd63399d22050551053/src/conmon.c#L131 has redirected stderr to /dev/null.
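The effect of such a redirect can be reproduced outside conmon. This is a minimal Python sketch, not conmon's code: a hypothetical child process writes an error to stderr and exits non-zero, and the parent either captures stderr or sends it to /dev/null. The message text, the exit code, and the name `run_child` are all made up for illustration.

```python
import subprocess
import sys

# Hypothetical failing child: prints an error to stderr, then exits with 1.
CHILD = 'import sys; sys.stderr.write("[child:e]: Failed to bind\\n"); sys.exit(1)'


def run_child(discard_stderr: bool):
    """Run the failing child; return (exit_code, stderr_text_seen_by_parent)."""
    stderr_target = subprocess.DEVNULL if discard_stderr else subprocess.PIPE
    proc = subprocess.run([sys.executable, "-c", CHILD], stderr=stderr_target)
    # With DEVNULL nothing is captured, so proc.stderr is None.
    text = proc.stderr.decode() if proc.stderr is not None else ""
    return proc.returncode, text


if __name__ == "__main__":
    print(run_child(discard_stderr=False))
    print(run_child(discard_stderr=True))
```

With stderr discarded, the parent learns only that the child failed and has no message to pass on, which would be consistent with the empty conmon output in the original error.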

I’m sure there is some strategy for conmon error handling, so at this point I’d prefer to let conmon experts weigh in.


Regardless, I’d say that sharing /tmp across machines is risky. Frequently enough, processes tend to assume (without a very good reason) that some names are exclusive to them and to that machine (consider the hard-coded X11 socket paths, or something using PID files with no machine ID to disambiguate).

I don’t know how much effort it makes sense to spend on supporting this.