containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
Podman creates a rundir with insufficient permissions #23062

Closed romanwoessner closed 1 day ago

romanwoessner commented 1 week ago

Issue Description

I have keepalived running on RHEL 9.4, which runs `podman ps` in a check script to monitor a rootless HAProxy container. Running the check script interactively in a Bash shell works, but running it from within keepalived fails with exit code 1. While debugging, I saw that Podman creates a rundir in the user's home directory and then runs into an error, presumably due to insufficient permissions.

Steps to reproduce the issue

  1. Create and run a HAProxy Podman container, install keepalived on RHEL 9.4, and use the following configs.

/etc/keepalived/keepalived.conf

    global_defs {
        script_user root
        enable_script_security
    }

    vrrp_script haproxy_check {
        script "/usr/libexec/keepalived/haproxy_check.sh"
        interval 1
        fall 2
        rise 2
        timeout 5
        user ansible
    }

    vrrp_instance VI_1 {
        interface ens33
        state BACKUP
        priority 95
        virtual_router_id 51
        virtual_ipaddress {
            193.10.10.5/24
        }
        track_script {
            haproxy_check
        }
    }

/usr/libexec/keepalived/haproxy_check.sh

    #!/bin/bash

    podman ps -f "name=haproxy" -f "status=running" -q | grep .

3. Restart keepalived `systemctl restart keepalived.service`

### Describe the results you received

Keepalived fails to run its check script:

    Keepalived_vrrp[204351]: Script haproxy_check now returning 1
    Keepalived_vrrp[204351]: VRRP_Script(haproxy_check) failed (exited with status 1)
    Keepalived_vrrp[204351]: (VI_1) Entering FAULT STATE
    Keepalived_vrrp[204351]: (VI_1) sent 0 priority
    Keepalived_vrrp[204351]: (VI_1) removing VIPs.

Modifying the check script to write its stdout and stderr to a file...

podman ps -f "name=haproxy" -f "status=running" -q > /home/ansible/check.log 2>&1

...reveals this error message:

level=error msg="unable to make rootless runtime: mkdir /home/ansible/rundir/containers: permission denied"

`ls -l` on `/home/ansible`:

drw------- 2 ansible ansible 6 21. Jun 08:43 rundir


What is the purpose of this rundir, and why does Podman create it when called from within the keepalived check script?
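The `drw-------` mode shown above is enough to explain the `permission denied`: creating an entry inside a directory requires the search (execute) bit on that directory, and mode 0600 lacks it even for the owner. A minimal sketch for checking this, assuming GNU `stat` is available; the helper name `owner_can_traverse` is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the directory's owner has the
# search (execute) bit set. Relies on GNU stat's -c option.
owner_can_traverse() {
    local mode
    mode=$(stat -c '%a' "$1") || return 2
    # Shift the octal mode right by six bits to isolate the owner digit;
    # the lowest bit of that digit is execute/search.
    local owner_digit=$(( (8#$mode >> 6) & 7 ))
    (( owner_digit & 1 ))
}

# Example:
#   owner_can_traverse /home/ansible/rundir || echo "owner cannot enter rundir"
```

The check looks only at the mode bits, so it gives the same answer when run as root, which would otherwise bypass the permission check.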

### Describe the results you expected

podman ps returns an exit code 0

### podman info output

```yaml
host:
  arch: amd64
  buildahVersion: 1.33.7
  cgroupControllers:
  - cpuset
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: fb8c4bf50dbc044a338137871b096eea8041a1fa'
  cpuUtilization:
    idlePercent: 94.79
    systemPercent: 2.08
    userPercent: 3.13
  cpus: 2
  databaseBackend: boltdb
  distribution:
    distribution: rhel
    version: "9.4"
  eventLogger: journald
  freeLocks: 2045
  hostname: #myhostname
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1111
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1111
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
  kernel: 5.14.0-427.20.1.el9_4.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1436237824
  memTotal: 3836882944
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-3.el9_4.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-1.el9.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.3-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.3
      commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
      rundir: /run/user/1111/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: true
    path: /run/user/1111/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.3-1.el9.x86_64
    version: |-
      slirp4netns version 1.2.3
      commit: c22fde291bb35b354e6ca44d13be181c76a0a432
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 6287257600
  swapTotal: 6287257600
  uptime: 186h 10m 21.00s (Approximately 7.75 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/ansible/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/ansible/.local/share/containers/storage
  graphRootAllocated: 32586407936
  graphRootUsed: 1450647552
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1111/containers
  transientStore: false
  volumePath: /home/ansible/.local/share/containers/storage/volumes
version:
  APIVersion: 4.9.4-rhel
  Built: 1714473991
  BuiltTime: Tue Apr 30 12:46:31 2024
  GitCommit: ""
  GoVersion: go1.21.9 (Red Hat 1.21.9-2.el9_4)
  Os: linux
  OsArch: linux/amd64
  Version: 4.9.4-rhel
```

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

Additional information

giuseppe commented 1 week ago

There is probably no user session for the ansible user, so Podman falls back to creating the rundir in the home directory.

Are you sure you want to run podman as the ansible user and not as root?

If you really want it, I suggest enabling lingering mode for that user (`loginctl enable-linger ansible`).
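Whether lingering is already enabled can be read back with `loginctl show-user`, which prints `Linger=yes` or `Linger=no` for the `Linger` property. A small sketch, assuming systemd-logind is running; the helper name `linger_reported` is hypothetical and only wraps the string comparison:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given `loginctl show-user` output
# reports that lingering is enabled.
linger_reported() {
    [[ "$1" == "Linger=yes" ]]
}

# Example (needs systemd-logind; an unknown or inactive user yields
# empty output and therefore a non-zero status):
#   linger_reported "$(loginctl show-user ansible --property=Linger 2>/dev/null)" \
#       && echo "lingering is enabled for ansible"
```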

romanwoessner commented 1 week ago

I want to have a rootless container. Therefore the HAProxy container is running in the user space of the ansible user. Lingering mode is already activated for this user.

giuseppe commented 1 week ago

strange that the /home/ansible/rundir directory is created when there is a user session.

do you get the same error if you run sudo -u ansible /usr/libexec/keepalived/haproxy_check.sh manually?

Can you temporarily turn off selinux to see if it is blocking the access to the directory?

romanwoessner commented 1 week ago

Turning off SELinux does not change the behavior. That is something I have already tested.

$ sestatus
SELinux status:                 disabled

Running the script as the ansible user works as expected:

$ /usr/libexec/keepalived/haproxy_check.sh
59f82621245d

Yes, I get the same error running it manually with sudo -u ansible:

$ sudo -u ansible /usr/libexec/keepalived/haproxy_check.sh
ERRO[0000] unable to make rootless runtime: mkdir /home/ansible/rundir/containers: permission denied

But, after stopping keepalived and removing the rundir with its 0600 permissions, it works:

$ sudo systemctl stop keepalived.service
$ rm -rf ~/rundir
$ sudo -u ansible /usr/libexec/keepalived/haproxy_check.sh
59f82621245d
$ ls -la ~/rundir
drwx------   3 ansible ansible   24 25. Jun 09:49 .
drwx------. 12 ansible ansible 4096 25. Jun 09:49 ..
drwx------   2 ansible ansible    6 25. Jun 09:49 containers

In this case, it gets created with sufficient permissions 0700.

As soon as I remove the rundir and start the keepalived service again, Podman recreates it with insufficient permissions:

$ rm -rf ~/rundir
$ sudo systemctl start keepalived.service
$ sudo ls -la /home/ansible/rundir
drw-------   2 ansible ansible    6 25. Jun 09:51 .
drwx------. 12 ansible ansible 4096 25. Jun 09:51 ..

It seems that only running the script from within keepalived causes this issue. I appreciate any help.

giuseppe commented 1 week ago

it smells like keepalived is using the wrong umask.

Can you try to override the umask value to something like 0022?

EDIT:

if https://github.com/acassen/keepalived/blob/master/lib/utils.c#L73 is the default umask used by keepalived, then that explains the missing exec bit set for the directory
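The effect of the umask can be worked through with plain arithmetic. This sketch assumes two things that the thread implies but does not state outright: that Podman requests mode 0700 for the rundir (the mode it gets when run from a normal shell), and that keepalived's default umask is 0177 (owner execute plus all group and other bits). The function name `effective_mode` is made up for illustration:

```shell
#!/bin/bash
# Effective mode of a newly created directory: the requested mode with
# the umask bits cleared. Both arguments are octal strings.
# Hypothetical helper, not part of Podman or keepalived.
effective_mode() {
    local requested=$1 mask=$2
    printf '%04o\n' $(( 8#$requested & ~8#$mask ))
}

effective_mode 0700 0177   # assumed keepalived default: prints 0600
effective_mode 0700 0022   # suggested override: prints 0700
```

Under those assumptions, 0177 strips the owner's execute bit and leaves exactly the `drw-------` directory seen above, while 0022 leaves the requested 0700 intact.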

giuseppe commented 6 days ago

@romanwoessner had a chance to try overriding the umask value?

romanwoessner commented 2 days ago

Thanks for the hint! I have tried overriding the umask and it works.

global_defs {
    ....
    umask 022
}
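An alternative that was not tried in this thread would be to set the umask for the Podman call inside the check script itself, so the script does not depend on the caller's umask. A minimal sketch; the wrapper name `run_with_umask` is hypothetical:

```shell
#!/bin/bash
# Hypothetical wrapper: run a command in a subshell with the given umask,
# leaving the caller's own umask unchanged.
run_with_umask() {
    local mask=$1
    shift
    ( umask "$mask" && "$@" )
}

# In haproxy_check.sh this could be used as (untested in this thread):
#   run_with_umask 0022 podman ps -f "name=haproxy" -f "status=running" -q | grep .
```

The subshell keeps the change local to the wrapped command, so anything else the script does still runs with whatever umask keepalived provides.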

I am still wondering why the rundir is created in the home directory. I have other RHEL machines with the same versions of podman and keepalived that behave differently and don't need this customization in the configuration.

giuseppe commented 1 day ago

the rundir is created in the home directory when Podman cannot create it under the user run directory (/run/user/$UID), so please make sure that directory is usable when Podman starts.
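A quick way to compare machines is to check whether the user run directory exists and is usable by the user that runs the check script. A minimal sketch; the helper name `runtime_dir_usable` is hypothetical, and the test is only an approximation of what Podman actually checks:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the directory exists and the current
# user can write to and traverse it. Defaults to /run/user/<uid>.
runtime_dir_usable() {
    local dir=${1:-/run/user/$(id -u)}
    [[ -d $dir && -w $dir && -x $dir ]]
}

# Example, run as the user keepalived uses for the script:
#   runtime_dir_usable || echo "/run/user/$(id -u) is missing or not usable"
```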

I suggest reporting an issue to keepalived as well, since the default umask prevents even the owner from accessing the directories it creates.

I am closing the issue as it appears the problem is not in Podman, but feel free to comment further