containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Rootless container exits immediately when run via a GitHub action but runs fine locally #10439

Closed: DanHam closed this issue 3 years ago

DanHam commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When running a rootless container with Podman on Ubuntu 20.04 via a GitHub Actions workflow, the container exits immediately.

Starting the same container image with sudo podman ... (rootless: false) works fine via GitHub Actions, as does running the container with Docker.

The same image runs fine (rootless: true) with an equivalent (as close as possible) install of Podman on an Ubuntu 20.04 VM.

Steps to reproduce the issue:

The issue can be consistently reproduced.

  1. Fork the GitHub repository that demonstrates the issue HERE.

  2. Go to the Actions tab

  3. Click on the Demo issue workflow.

  4. Click on the Run workflow drop down on the right hand side of the screen and then click Run workflow.

Describe the results you received:

Instead of continuing to run in detached mode, the container exits immediately; the STATUS field in the output of podman ps -a shows Exited (255)....

Describe the results you expected:

The container should continue to run in detached mode - the same way it does on my local system and in my Ubuntu 20.04 VM.
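
For reference, the failing sequence boils down to roughly the commands below; the container name and image are the ones used throughout this thread, and the ps --format string is just shorthand for checking the STATUS column (the workflow itself may differ slightly in the exact flags):

# run the systemd-based image rootless and detached on the Actions runner
podman run -d --name deb10 localhost/debian-10-systemd
# check its status; on the runner this reports Exited (255)... rather than Up
podman ps -a --format '{{.Names}} {{.Status}}'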

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

From the GitHub Action workflow debug output:

Version:      3.1.2
API Version:  3.1.2
Go Version:   go1.15.2
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

From the GitHub Action workflow debug output:

host:
  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 2
  distribution:
    distribution: ubuntu
    version: "20.04"
  eventLogger: journald
  hostname: fv-az93-395
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 121
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  kernel: 5.4.0-1047-azure
  linkmode: dynamic
  memFree: 4737449984
  memTotal: 7292141568
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1.3-9b83-dirty
      commit: 33851ada2cc9bf3945915565bf3c2df97facb92c
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /tmp/podman-run-1001/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.1.8
      commit: unknown
      libslirp: 4.3.1-git
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 4m 43.88s
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/runner/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'
      Version: |-
        fusermount3 version: 3.9.0
        fuse-overlayfs: version 1.5
        FUSE library version 3.9.0
        using FUSE kernel interface version 7.31
  graphRoot: /home/runner/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 3
  runRoot: /tmp/podman-run-1001/containers
  volumePath: /home/runner/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 3.1.2

Side-by-side diff of podman info --debug from Ubuntu 20.04 running via GitHub Actions (left) and from an Ubuntu 20.04 VM (right), showing differences in idMappings.

Could this be causing the issue?

host:                                                                     (
  arch: amd64                                                             (
  buildahVersion: 1.20.1                                                  (
  cgroupManager: cgroupfs                                                 (
  cgroupVersion: v1                                                       (
  conmon:                                                                 (
    package: 'conmon: /usr/libexec/podman/conmon'                         (
    path: /usr/libexec/podman/conmon                                      (
    version: 'conmon version 2.0.27, commit: '                            (
  cpus: 2                                                                 |    cpus: 1
  distribution:                                                           (
    distribution: ubuntu                                                  (
    version: "20.04"                                                      (
  eventLogger: journald                                                   (
  hostname: fv-az93-395                                                   |    hostname: focal
  idMappings:                                                             (
    gidmap:                                                               (
    - container_id: 0                                                     (
      host_id: 121                                                        |        host_id: 1000
      size: 1                                                             (
    - container_id: 1                                                     (
      host_id: 165536                                                     |        host_id: 100000
      size: 65536                                                         (
    uidmap:                                                               (
    - container_id: 0                                                     (
      host_id: 1001                                                       |        host_id: 1000
      size: 1                                                             (
    - container_id: 1                                                     (
      host_id: 165536                                                     |        host_id: 100000
      size: 65536                                                         (
  kernel: 5.4.0-1047-azure                                                |    kernel: 5.4.0-73-generic
  linkmode: dynamic                                                       (
  memFree: 4737449984                                                     |    memFree: 716386304
  memTotal: 7292141568                                                    |    memTotal: 2084315136
  ociRuntime:                                                             (
    name: crun                                                            (
    package: 'crun: /usr/bin/crun'                                        (
    path: /usr/bin/crun                                                   (
    version: |-                                                           (
      crun version 0.19.1.3-9b83-dirty                                    (
      commit: 33851ada2cc9bf3945915565bf3c2df97facb92c                    (
      spec: 1.0.0                                                         (
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL         (
  os: linux                                                               (
  remoteSocket:                                                           (
    path: /tmp/podman-run-1001/podman/podman.sock                         |      path: /run/user/1000/podman/podman.sock
  security:                                                               (
    apparmorEnabled: false                                                (
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KI (
    rootless: true                                                        (
    seccompEnabled: true                                                  (
    selinuxEnabled: false                                                 (
  slirp4netns:                                                            (
    executable: /usr/bin/slirp4netns                                      (
    package: 'slirp4netns: /usr/bin/slirp4netns'                          (
    version: |-                                                           (
      slirp4netns version 1.1.8                                           (
      commit: unknown                                                     (
      libslirp: 4.3.1-git                                                 (
      SLIRP_CONFIG_VERSION_MAX: 3                                         (
      libseccomp: 2.4.3                                                   (
  swapFree: 4294963200                                                    |    swapFree: 0
  swapTotal: 4294963200                                                   |    swapTotal: 0
  uptime: 4m 43.88s                                                       |    uptime: 8h 38m 4.59s (Approximately 0.33 days)
registries:                                                               (
  search:                                                                 (
  - docker.io                                                             (
  - quay.io                                                               (
store:                                                                    (
  configFile: /home/runner/.config/containers/storage.conf                |    configFile: /home/vagrant/.config/containers/storage.conf
  containerStore:                                                         (
    number: 0                                                             (
    paused: 0                                                             (
    running: 0                                                            (
    stopped: 0                                                            (
  graphDriverName: overlay                                                (
  graphOptions:                                                           (
    overlay.mount_program:                                                (
      Executable: /usr/bin/fuse-overlayfs                                 (
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'                  (
      Version: |-                                                         (
        fusermount3 version: 3.9.0                                        (
        fuse-overlayfs: version 1.5                                       (
        FUSE library version 3.9.0                                        (
        using FUSE kernel interface version 7.31                          (
  graphRoot: /home/runner/.local/share/containers/storage                 |    graphRoot: /home/vagrant/.local/share/containers/storage
  graphStatus:                                                            (
    Backing Filesystem: extfs                                             (
    Native Overlay Diff: "false"                                          (
    Supports d_type: "true"                                               (
    Using metacopy: "false"                                               (
  imageStore:                                                             (
    number: 3                                                             (
  runRoot: /tmp/podman-run-1001/containers                                |    runRoot: /run/user/1000/containers
  volumePath: /home/runner/.local/share/containers/storage/volumes        |    volumePath: /home/vagrant/.local/share/containers/storage/volumes
version:                                                                  (
  APIVersion: 3.1.2                                                       (
  Built: 0                                                                (
  BuiltTime: Thu Jan  1 00:00:00 1970                                     (
  GitCommit: ""                                                           (
  GoVersion: go1.15.2                                                     (
  OsArch: linux/amd64                                                     (
  Version: 3.1.2                                                          (

Package info (e.g. output of rpm -q podman or apt list podman):

From the GitHub Action workflow debug output:

Listing...
podman/now 100:3.1.2-1 amd64 [installed,local]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

See the workflow file and debug output in the workflow run.

rhatdan commented 3 years ago

@jwhonce @baude PTAL

vrothberg commented 3 years ago

@DanHam, do you have access to the logs and could you share them? Running with --log-level=debug is a good start. Do you see anything suspicious in the journal?
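
Something along these lines should be enough to gather that (the run command mirrors the reproducer; the journalctl invocation is just one way to dump recent entries):

# enable podman's global debug logging, then check the container and the host journal
podman --log-level=debug run -d --name deb10 localhost/debian-10-systemd
podman ps -a
journalctl -r --no-pager | head -n 100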

DanHam commented 3 years ago

@vrothberg Hi. Thanks for taking a look at this.

Were you unable to reproduce the issue by forking the demo repository I created?

Just in case you missed it, please see the points under 'Steps to Reproduce the Issue' above.

If you fork the demo repo you can run the GitHub action yourself. This will give you full access to all of the logs and allow you to 'tinker' to further diagnose the issue.

Please let me know if you are unable to do this for any reason and I will do my best to provide you with any logs/run further commands on your behalf to diagnose the problem.

vrothberg commented 3 years ago

Thanks, @DanHam! Yes, I saw the reproducers but am short on time juggling a number of issues in parallel at the moment. I may find more time tomorrow to look into it.

vrothberg commented 3 years ago

I followed the instructions and tried to run the workflow on my fork but nothing seems to happen. I get a popup stating "This workflow has a workflow_dispatch event trigger.".

@DanHam, do you know what to do? I don't have much experience with GitHub Actions and feel I am missing the obvious.

DanHam commented 3 years ago

There should be a 'Run workflow' drop down/button on the right hand side. Click that, then click the green 'Run workflow' button.

If you can't see that 'Run workflow' drop down then you may be on the wrong page. So:

  1. From the main repo select the 'Actions' tab.
  2. Under 'Workflows' -> All workflows -> Click on 'Demo issue'
  3. In the main part of the window you should see the 'This workflow has a workflow_dispatch event trigger' notification.
  4. Click on the 'Run workflow' drop down button and select 'Run workflow'
vrothberg commented 3 years ago

The results of the previous runs just showed up now. Looks like it takes a while; maybe related to the recent GitHub outage. Thanks :)

DanHam commented 3 years ago

I can see you've run the action a few times!! :smile: Hopefully, you will now be able to drill down into each step and see the output.

vrothberg commented 3 years ago

I played a bit with the GitHub action and saw the following logs in journal:

2021-06-02T10:49:19.3195458Z Jun 02 10:49:19 fv-az216-850 podman[3100]: 2021-06-02 10:49:19.090161684 +0000 UTC m=+0.059423896 container died 711d5123195406f5d392f2ba51c42674941b105495b11cce437ac9e3a93c3b33 (image=localhost/debian-10-systemd, name=deb10)
2021-06-02T10:49:19.3198839Z Jun 02 10:49:19 fv-az216-850 /usr/bin/podman[3100]: time="2021-06-02T10:49:19Z" level=debug msg="Failed to add podman to systemd sandbox cgroup: exec: \"dbus-launch\": executable file not found in $PATH"

Failed to add podman to systemd sandbox cgroup: exec: \"dbus-launch\": executable file not found in $PATH

@giuseppe, do you know why it works as root but not rootless?

DanHam commented 3 years ago

@vrothberg @giuseppe Just to reiterate - the problematic container runs fine rootless on my local system. It is only when trying to run the container rootless via a GitHub Action that I see the issue.

vrothberg commented 3 years ago

Thanks, @DanHam. It also runs fine on my local system. @giuseppe is the cgroups expert, and I am sure he knows what's going on. It could be that installing dbus-launch will solve the problem, but I wonder why we're not hitting the issue as root.

giuseppe commented 3 years ago

Does the rootless container have access to cgroups or systemd? I think we need to enforce --cgroup-manager cgroupfs

DanHam commented 3 years ago

It could be that installing dbus-launch will solve the problem, but I wonder why we're not hitting the issue as root.

Right... but also - why are we only seeing this issue rootless on GitHub Actions? Why do we not have the same issue running rootless locally?

DanHam commented 3 years ago

@vrothberg @giuseppe

I think we need to enforce --cgroup-manager cgroupfs

Running with podman run -d --log-level=debug --cgroup-manager=cgroupfs --name deb10 localhost/debian-10-systemd has no effect - the issue still occurs.

It could be that installing dbus-launch will solve the problem...

I've tried installing dbus-launch (this is provided by the dbus-x11 package). This does not solve the issue.

However, now we are getting a different error:

Output of journalctl -r:
-- Logs begin at Thu 2021-05-27 08:00:13 UTC, end at Wed 2021-06-02 13:23:17 UTC. --
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3285]: Cannot setup inotify for '/root/.local/share/dbus-1/services'; error 'Permission denied'
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3285]: [session uid=0 pid=3283] AppArmor D-Bus mediation is enabled
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3263]: Cannot setup inotify for '/root/.local/share/dbus-1/services'; error 'Permission denied'
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3263]: [session uid=0 pid=3261] AppArmor D-Bus mediation is enabled
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="Called cleanup.PersistentPostRunE(/usr/bin/podman --root /home/runner/.local/share/containers/storage --runroot /tmp/podman-run-1001/containers --log-level debug --cgroup-manager cgroupfs --tmpdir /tmp/run-1001/libpod/tmp --runtime crun --storage-driver overlay --storage-opt overlay.mount_program=/usr/bin/fuse-overlayfs --events-backend journald --syslog container cleanup 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef)"
Jun 02 13:23:17 fv-az118-288 podman[3234]: 2021-06-02 13:23:17.484027782 +0000 UTC m=+0.101841018 container cleanup 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef (image=localhost/debian-10-systemd, name=deb10, io.buildah.version=1.21.0)
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="unmounted container \"2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef\""
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="Successfully cleaned up container 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef"
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="Tearing down network namespace at /tmp/podman-run-1001/netns/cni-0605e3af-524c-c652-af69-d02d114bacf7 for container 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef"
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="Cleaning up container 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef"
Jun 02 13:23:17 fv-az118-288 podman[3234]: 2021-06-02 13:23:17.463187898 +0000 UTC m=+0.081001234 container died 2913aea33174ab23a57d9014cbc13836c5b308e6d018bcc7f5d43380828138ef (image=localhost/debian-10-systemd, name=deb10)
Jun 02 13:23:17 fv-az118-288 /usr/bin/podman[3234]: time="2021-06-02T13:23:17Z" level=debug msg="Failed to add podman to systemd sandbox cgroup: dbus: authentication failed"
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3255]: Cannot setup inotify for '/root/.local/share/dbus-1/services'; error 'Permission denied'
Jun 02 13:23:17 fv-az118-288 dbus-daemon[3255]: [session uid=0 pid=3247] AppArmor D-Bus mediation is enabled

Note that dbus-x11 (and hence dbus-launch) is NOT installed on my local Ubuntu system where podman runs the rootless container fine.

I'm not convinced we should be focusing attention on the install of dbus-launch to solve this issue. Instead, I think we should be asking ourselves:

Why is there an attempt made to add podman to systemd sandbox cgroup: dbus when running podman in the environment provided by GitHub Actions when this does not happen locally?

giuseppe commented 3 years ago

Why is there an attempt made to add podman to systemd sandbox cgroup: dbus when running podman in the environment provided by GitHub Actions when this does not happen locally?

This is done when Podman is running in a cgroup not owned by the rootless user, which happens when running on systemd with cgroup v2.

--cgroup-manager is an option to podman, not podman run. Could you please try with podman --cgroup-manager=cgroupfs run -d --log-level=debug --name deb10 localhost/debian-10-systemd? Does the issue still happen?

DanHam commented 3 years ago

@giuseppe

--cgroup-manager is an option to podman, not podman run

Ah, OK! Sorry - should have spotted that.

Could you please try with podman --cgroup-manager=cgroupfs run -d --log-level=debug --name deb10 localhost/debian-10-systemd? Does the issue still happen?

So I've run again with podman --cgroup-manager=cgroupfs run --log-level=debug -d --name deb10 localhost/debian-10-systemd.

Unfortunately, this did not solve the issue. The error is identical to before.

See the results of the GitHub Action running with the --cgroup-manager=cgroupfs flag set HERE.

The results of the same GitHub Action without the --cgroup-manager=cgroupfs flag are HERE

With regard to the error: Failed to add podman to systemd sandbox cgroup: exec: \"dbus-launch\": executable file not found in $PATH. Looking at the debug output from podman, this appears to happen fairly early on. I was wondering if this is a terminal error or if podman just logs and ignores this?

giuseppe commented 3 years ago

That is just a debug statement.

I think the real failure is kernel: overlayfs: unrecognized mount option "userxattr" or missing value.

Podman is not correctly detecting support for overlay in a user namespace. This was fixed recently, and probably the fix is not yet in the Podman version you are using.

I'd suggest forcing the use of fuse-overlayfs with podman --storage-driver overlay --storage-opt overlay.mount_program=/usr/bin/fuse-overlayfs ...
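
Spelled out against your image, that would be roughly (the run flags are carried over from earlier in the thread):

podman --storage-driver overlay \
       --storage-opt overlay.mount_program=/usr/bin/fuse-overlayfs \
       run -d --log-level=debug --name deb10 localhost/debian-10-systemd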

DanHam commented 3 years ago

@giuseppe

I've tried again with podman --storage-driver overlay --storage-opt overlay.mount_program=/usr/bin/fuse-overlayfs.

See the output from the run HERE

As you can see, this doesn't seem to help - the main issue remains, and the kernel: overlayfs: unrecognized mount option "userxattr" or missing value error (warning?) still appears.

With regard to versions of various components, both the GitHub environment and my local Ubuntu VM share identical versions of all components I've looked at - e.g. podman, buildah, fuse-overlayfs etc. Clearly, in the GitHub environment the container fails to run, while in the Ubuntu VM it runs fine.

However, I do NOT see the kernel: overlayfs: unrecognized mount option "userxattr" or missing value error when I run the container in my Ubuntu VM.

Looking at the logs from previous runs (without the --storage-driver overlay... flags set) and at the output of podman info --debug, it seems podman was using the overlay storage driver and /usr/bin/fuse-overlayfs as the mount program by default.
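
A quick way to confirm those defaults without wading through the full output (the --format field path should be right for podman 3.x, but treat it as a best guess):

podman info --format '{{.Store.GraphDriverName}}'   # prints: overlay
podman info --debug | grep -A 4 graphOptions        # shows overlay.mount_program -> /usr/bin/fuse-overlayfs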

For reference see the diff output below. Output from the GitHub environment is on the left; Output from the Ubuntu VM (with just the differences shown) is on the right:

  arch: amd64                                                             (
  buildahVersion: 1.20.1                                                  (
  cgroupManager: cgroupfs                                                 (
  cgroupVersion: v1                                                       (
  conmon:                                                                 (
    package: 'conmon: /usr/libexec/podman/conmon'                         (
    path: /usr/libexec/podman/conmon                                      (
    version: 'conmon version 2.0.27, commit: '                            (
  cpus: 2                                                                 |    cpus: 1
  distribution:                                                           (
    distribution: ubuntu                                                  (
    version: "20.04"                                                      (
  eventLogger: journald                                                   (
  hostname: fv-az93-734                                                   |    hostname: focal
  idMappings:                                                             (
    gidmap:                                                               (
    - container_id: 0                                                     (
      host_id: 121                                                        |        host_id: 1000
      size: 1                                                             (
    - container_id: 1                                                     (
      host_id: 165536                                                     |        host_id: 100000
      size: 65536                                                         (
    uidmap:                                                               (
    - container_id: 0                                                     (
      host_id: 1001                                                       |        host_id: 1000
      size: 1                                                             (
    - container_id: 1                                                     (
      host_id: 165536                                                     |        host_id: 100000
      size: 65536                                                         (
  kernel: 5.4.0-1047-azure                                                |    kernel: 5.4.0-74-generic
  linkmode: dynamic                                                       (
  memFree: 5053829120                                                     |    memFree: 833081344
  memTotal: 7292145664                                                    |    memTotal: 2084319232
  ociRuntime:                                                             (
    name: crun                                                            (
    package: 'crun: /usr/bin/crun'                                        (
    path: /usr/bin/crun                                                   (
    version: |-                                                           (
      crun version 0.19.1.3-9b83-dirty                                    (
      commit: 33851ada2cc9bf3945915565bf3c2df97facb92c                    (
      spec: 1.0.0                                                         (
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL         (
  os: linux                                                               (
  remoteSocket:                                                           (
    path: /home/runner/.local/podman/podman.sock                          |      path: /run/user/1000/podman/podman.sock
  security:                                                               (
    apparmorEnabled: false                                                (
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KI (
    rootless: true                                                        (
    seccompEnabled: true                                                  (
    selinuxEnabled: false                                                 (
  slirp4netns:                                                            (
    executable: /usr/bin/slirp4netns                                      (
    package: 'slirp4netns: /usr/bin/slirp4netns'                          (
    version: |-                                                           (
      slirp4netns version 1.1.8                                           (
      commit: unknown                                                     (
      libslirp: 4.3.1-git                                                 (
      SLIRP_CONFIG_VERSION_MAX: 3                                         (
      libseccomp: 2.4.3                                                   (
  swapFree: 4294963200                                                    |    swapFree: 0
  swapTotal: 4294963200                                                   |    swapTotal: 0
  uptime: 3m 34.24s                                                       |    uptime: 1h 33m 18.97s (Approximately 0.04 days)
registries:                                                               (
  search:                                                                 (
  - docker.io                                                             (
  - quay.io                                                               (
store:                                                                    (
  configFile: /home/runner/.config/containers/storage.conf                |    configFile: /home/vagrant/.config/containers/storage.conf
  containerStore:                                                         (
    number: 1                                                             (
    paused: 0                                                             (
    running: 0                                                            |      running: 1
    stopped: 1                                                            |      stopped: 0
  graphDriverName: overlay                                                (
  graphOptions:                                                           (
    overlay.mount_program:                                                (
      Executable: /usr/bin/fuse-overlayfs                                 (
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'                  (
      Version: |-                                                         (
        fusermount3 version: 3.9.0                                        (
        fuse-overlayfs: version 1.5                                       (
        FUSE library version 3.9.0                                        (
        using FUSE kernel interface version 7.31                          (
  graphRoot: /home/runner/.local/share/containers/storage                 |    graphRoot: /home/vagrant/.local/share/containers/storage
  graphStatus:                                                            (
    Backing Filesystem: extfs                                             (
    Native Overlay Diff: "false"                                          (
    Supports d_type: "true"                                               (
    Using metacopy: "false"                                               (
  imageStore:                                                             (
    number: 2                                                             (
  runRoot: /home/runner/.local/containers                                 |    runRoot: /run/user/1000/containers
  volumePath: /home/runner/.local/share/containers/storage/volumes        |    volumePath: /home/vagrant/.local/share/containers/storage/volumes
version:                                                                  (
  APIVersion: 3.1.2                                                       (
  Built: 0                                                                (
  BuiltTime: Thu Jan  1 00:00:00 1970                                     (
  GitCommit: ""                                                           (
  GoVersion: go1.15.2                                                     (
  OsArch: linux/amd64                                                     (
  Version: 3.1.2                                                          (

There don't seem to be any substantial differences between the two...

DanHam commented 3 years ago

kernel: overlayfs: unrecognized mount option "userxattr" or missing value

Is there anything further you can think of to help diagnose whether this is the root cause of our issue?

giuseppe commented 3 years ago

I am giving it a try, but I think the container is created correctly and then systemd exits immediately.

giuseppe commented 3 years ago

Yes, if you create the container without -d and use -t, you get more useful information:


Welcome to Debian GNU/Linux 10 (buster)!

Set hostname to <f9730d28b722>.
Failed to create /system.slice/runner-provisioner.service/init.scope control group: Permission denied
Failed to allocate manager object: Permission denied
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...

That means systemd has no access to cgroups and it simply gives up.
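
For reference, the foreground invocation is along these lines (the flags are the important part, not the exact spelling):

# no -d, so the container stays in the foreground; -t gives systemd a terminal to write to
podman run --rm -t --name deb10 localhost/debian-10-systemd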

systemd on cgroup v1 doesn't need access to all controllers, but it needs at least access to the named systemd hierarchy.

I am closing the issue because I don't think there is anything podman can do about it, but feel free to comment further here.

TomSweeneyRedHat commented 3 years ago

@giuseppe this might be a good candidate for the known issues page?

DanHam commented 3 years ago

@giuseppe @TomSweeneyRedHat

I am closing the issue because I don't think there is anything podman can do about it

I agree that this isn't caused by podman. However, there is clearly something wrong here that limits the utility of podman within a GitHub Actions environment.


I have done a bit of further investigation to try to determine exactly why podman can run the container rootless locally in an Ubuntu 20.04 VM but not within the GitHub Actions environment (which also runs an Ubuntu 20.04 VM).

Both are using cgroup v1 (legacy hierarchy) for systemd. Both have the exact same mount options.

$ mount | grep cgroup | grep systemd
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)

Running the container with debug logging enabled for the systemd process running within the container shows the following:

For the local Ubuntu VM:

Found cgroup on /sys/fs/cgroup/systemd, legacy hierarchy
Using cgroup controller name=systemd. File system hierarchy is at /sys/fs/cgroup/systemd/user.slice/user-1000.slice/user@1000.service/user.slice/podman-32837.scope.
...

For the GitHub Actions environment:

Found cgroup on /sys/fs/cgroup/systemd, legacy hierarchy
Using cgroup controller name=systemd. File system hierarchy is at /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service.
Failed to create /system.slice/runner-provisioner.service/init.scope control group: Permission denied
Failed to allocate manager object: Permission denied

Looking at the ownership and permissions on those folders:

For the local Ubuntu VM:

$ ls -ld /sys/fs/cgroup/systemd/user.slice/user-1000.slice/user@1000.service
drwxr-xr-x 6 vagrant vagrant 0 Jun  3 10:35 /sys/fs/cgroup/systemd/user.slice/user-1000.slice/user@1000.service

For the GitHub Actions environment:

Output of ls -ld /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service:
drwxr-xr-x 2 root root 0 Jun  4 10:59 /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service

Clearly, the permissions on that folder within the GitHub Actions environment are the cause of the failing container: the GitHub Actions user (runner) running podman cannot write to that directory.

@giuseppe

Am I right in saying that systemd creates user.slice/user-1000.slice/user@1000.service when the user logs in?

Clearly, within the GitHub Actions environment we don't actually log in, so we don't get a writable directory assigned to our user. As such, should the ownership or permissions be changed to give the GitHub Actions user/group (runner:docker) write access to /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service?

Is this something that could be taken up with the GitHub Actions team?

DanHam commented 3 years ago

While I don't see this as a viable workaround, I've tried what I consider an ugly hack: brute-forcing the ownership of /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service:

sudo chown -R $(id -un):$(id -gn) /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service

This works and allows podman to successfully run the container in rootless mode. See the output of the GitHub Actions run HERE.

DanHam commented 3 years ago

@giuseppe @TomSweeneyRedHat @rhatdan

It seems others have come across and been affected by this exact issue - see https://github.com/actions/virtual-environments/issues/3536

this might be a good candidate for the known issues page?

While I agree that this isn't caused by podman, the issue can be fixed by a simple chown, so it seems a shame not to take this further and get it fixed.

Running sudo chown -R $(id -un):$(id -gn) /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service prior to running podman fixes the issue and allows systemd containers to be run rootless by podman within the GitHub Actions environment.
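
In other words, a shell step like this at the start of the job is enough (the image and container name are from my reproducer; everything else is boilerplate):

# give the runner user ownership of its systemd cgroup, then run rootless as normal
sudo chown -R "$(id -un):$(id -gn)" /sys/fs/cgroup/systemd/system.slice/runner-provisioner.service
podman run -d --name deb10 localhost/debian-10-systemd
podman ps -a    # STATUS now shows Up ... rather than Exited (255)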

@giuseppe @rhatdan

I was wondering if any of you can see potential issues (operational or security) with making this the default within the GitHub virtual environment builds. To my mind, this is akin to the ownership of user.slice/user-1000.slice/user@1000.service that is automatically configured on login (?) within a 'normal' system.

rhatdan commented 3 years ago

I don't see an issue with it, other than potentially allowing that user to chown the content, but that user is already allowed sudo, so I really don't see this as a problem.