containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Refreshing container <containerID>: error acquiring lock 0 for container <containerID>: file exists #16784

Open · Adelcelevator opened this issue 1 year ago

Adelcelevator commented 1 year ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

This bug is an old, known one; I tried to fix it the way other people resolved it, but it still persists. When I log in via SSH, I start some containers and use them; those containers stay up while the session is open. But when I log out, close the session, or lose the SSH connection, the containers go down automatically, and when I enter the server again and run podman ps, I find the error message from the title at the top of the output.

Steps to reproduce the issue:

  1. Have a server with the following specification:
NAME="Rocky Linux"
VERSION="9.1 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.1"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.1 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.1"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.1"
Rocky Linux release 9.1 (Blue Onyx)

RAM: 8GB
SWAP: Disabled
Number of CPU: 4
SELinux: Enforcing
  2. Access the server via SSH.

  3. Create a container with Podman.

  4. Make sure your container is running.

  5. Close your session and disconnect from the server.

  6. Log in to the server again via SSH; the container will be in the Exited state, and the error message will appear at the beginning of the podman ps output (a condensed sketch follows below).
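
A condensed sketch of the reproduction above, using the image, port mapping, user, and hostname from this report (the pgadmin credentials are illustrative placeholders):

# 1. SSH in and start a rootless container.
ssh panchito@pruebas
podman run -d --name pgadminLatest -p 8445:80 \
  -e PGADMIN_DEFAULT_EMAIL=admin@example.com \
  -e PGADMIN_DEFAULT_PASSWORD=changeme \
  docker.io/dpage/pgadmin4:latest
podman ps      # the container shows as Up

# 2. Disconnect, then reconnect.
exit
ssh panchito@pruebas
podman ps -a   # the container is Exited; the "acquiring lock ...: file exists" errors print first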

Describe the results you received: When I follow the previous steps, I get the following output:

$ podman ps -a --log-level=DEBUG
INFO[0000] podman filtering at log level debug
DEBU[0000] Called ps.PersistentPreRunE(podman ps -a --log-level=DEBUG)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /home/panchito/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] systemd-logind: Unknown object '/'.
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/panchito/.local/share/containers/storage
DEBU[0000] Using run root /run/user/1000/containers
DEBU[0000] Using static dir /home/panchito/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp
DEBU[0000] Using volume path /home/panchito/.local/share/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] Not configuring container store
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
DEBU[0000] systemd-logind: Unknown object '/'.
DEBU[0000] Invalid systemd user session for current user
INFO[0000] podman filtering at log level debug
DEBU[0000] Called ps.PersistentPreRunE(podman ps -a --log-level=DEBUG)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /home/panchito/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] systemd-logind: Unknown object '/'.
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/panchito/.local/share/containers/storage
DEBU[0000] Using run root /run/user/1000/containers
DEBU[0000] Using static dir /home/panchito/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp
DEBU[0000] Using volume path /home/panchito/.local/share/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] overlay: test mount with multiple lowers succeeded
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] overlay: test mount indicated that metacopy is not being used
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
DEBU[0000] Podman detected system restart - performing state refresh
ERRO[0000] Refreshing container ac4c36f3d1a23215a46a8cc66216a0c28f1d1019b35135ccb341eec17b7f44f5: error acquiring lock 0 for container ac4c36f3d1a23215a46a8cc66216a0c28f1d1019b35135ccb341eec17b7f44f5: file exists
ERRO[0000] Refreshing volume e295cbb86f85d711a89371064a0795f8cca1bc712314475295526e925a356f97: acquiring lock 1 for volume e295cbb86f85d711a89371064a0795f8cca1bc712314475295526e925a356f97: file exists
INFO[0000] Setting parallel job count to 13
DEBU[0000] container ac4c36f3d1a23215a46a8cc66216a0c28f1d1019b35135ccb341eec17b7f44f5 has no defined healthcheck
CONTAINER ID  IMAGE                            COMMAND     CREATED      STATUS                  PORTS                 NAMES
ac4c36f3d1a2  docker.io/dpage/pgadmin4:latest              4 hours ago  Exited (0) 4 hours ago  0.0.0.0:8445->80/tcp  pgadminLatest
DEBU[0000] Called ps.PersistentPostRunE(podman ps -a --log-level=DEBUG)

Describe the results you expected:

I expected my container to still be running.

Additional information you deem important (e.g. issue happens only occasionally): Before I enabled SELinux and rebooted the system, everything worked fine. After I enabled SELinux and rebooted for an autorelabel, it broke. I looked for SELinux alerts and even put SELinux in Permissive mode, but the problem still persists.
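
For reference, the SELinux checks described above use standard tooling (a sketch; ausearch is part of the audit package):

getenforce                       # prints Enforcing or Permissive
sudo ausearch -m avc -ts recent  # list recent SELinux denials, if any
sudo setenforce 0                # switch to Permissive for testing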

Output of podman version:

$ podman --version
podman version 4.2.0

Output of podman info:

$ podman info
host:
  arch: amd64
  buildahVersion: 1.27.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: fd49ef99363f06fe6b6ab119070cd95c6cc7c35a'
  cpuUtilization:
    idlePercent: 99.72
    systemPercent: 0.12
    userPercent: 0.16
  cpus: 4
  distribution:
    distribution: '"rocky"'
    version: "9.1"
  eventLogger: journald
  hostname: pruebas
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.14.0-162.6.1.el9_1.0.1.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 7633539072
  memTotal: 8053719040
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.5-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.5
      commit: 54ebb8ca8bf7e6ddae2eb919f5b82d1d96863dea
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-2.el9.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 0
  swapTotal: 0
  uptime: 27h 27m 50.00s (Approximately 1.12 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/panchito/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/panchito/.local/share/containers/storage
  graphRootAllocated: 208172843008
  graphRootUsed: 3986575360
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/panchito/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.0
  Built: 1668531608
  BuiltTime: Tue Nov 15 18:00:08 2022
  GitCommit: ""
  GoVersion: go1.18.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.0

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

$ dnf info podman
Last metadata expiration check: 0:01:13 ago on Thu 08 Dec 2022 02:50:30 AM CET.
Installed Packages
Name         : podman
Epoch        : 2
Version      : 4.2.0
Release      : 7.el9_1
Architecture : x86_64
Size         : 41 M
Source       : podman-4.2.0-7.el9_1.src.rpm
Repository   : @System
From repo    : appstream
Summary      : Manage Pods, Containers and Container Images
URL          : https://podman.io/
License      : ASL 2.0 and GPLv3+
Description  : podman (Pod Manager) is a fully featured container engine that is a simple
             : daemonless tool.  podman provides a Docker-CLI comparable command line that
             : eases the transition from other container engines and allows the management of
             : pods, containers and images.  Simply put: alias docker=podman.
             : Most podman commands can be run as a regular user, without requiring
             : additional privileges.
             :
             : podman uses Buildah(1) internally to create container images.
             : Both tools share image (not container) storage, hence each can use or
             : manipulate images (but not containers) created by the other.
             :
             : Manage Pods, Containers and Container Images
             : podman Simple management tool for pods, containers and images

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

It is hosted with the Contabo provider.

Luap99 commented 1 year ago

@mheon PTAL

The problem is that Podman thinks you rebooted, since systemd cleans up all tmpfs files under /run/user/$UID when you close your session. However, the locks are stored in /dev/shm/libpod_rootless_lock_$UID, which is not removed, so when you log in again Podman thinks the locks in that file are already taken and errors out. As a workaround it should be enough to remove that file, but make sure to do this only before you run any podman commands.
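
A minimal sketch of that workaround, run as the rootless user right after logging in and before any podman command:

# Remove the stale SHM lock file left over from the previous session;
# podman recreates it on its next invocation. $(id -u) is the user's UID, e.g. 1000.
rm -f /dev/shm/libpod_rootless_lock_$(id -u)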

mheon commented 1 year ago

This is usually why we recommend loginctl enable-linger for users running Podman processes - avoiding systemd session cleanup avoids the problem entirely.

I suppose we could make the refresh logic auto-remove the shared-memory locks to force recreation, but I feel like we're just scratching the surface of the problems here. Detecting a reboot when one did not occur seems ripe to cause issues with lingering files and state that we don't expect.
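
For reference, a sketch of enabling lingering and verifying it (the username panchito is taken from this report):

sudo loginctl enable-linger panchito
loginctl show-user panchito --property=Linger   # expect "Linger=yes"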

Luap99 commented 1 year ago

Could we move the lock SHM file into a "normal" file on tmpfs (the runroot) instead? I have run into issues like this several times while playing around with different --root/--runroot combinations for testing; they all end up sharing the same lock file, which feels wrong. I don't think there is a difference other than calling regular open() vs shm_open(), assuming the file we open is actually on a tmpfs.
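
A sketch of the testing scenario described here, where two independent storage roots still contend for the same per-UID SHM lock file:

podman --root /tmp/rootA --runroot /tmp/runA ps
podman --root /tmp/rootB --runroot /tmp/runB ps
ls /dev/shm/libpod_rootless_lock_$(id -u)   # a single lock file serves both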

mheon commented 1 year ago

What we're doing now is more POSIX-compatible than that, AFAIK, so this might matter for FreeBSD? Otherwise, I don't see a reason why we couldn't do this.

Adelcelevator commented 1 year ago

Hi @Luap99, I executed podman system reset; in theory that should solve the problem, but it persists when I create new containers. When I log out after creating the containers and log in again, the same error is printed when I execute podman ps -a.
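
For context, podman system reset wipes all containers, images, and volumes, yet per this report the lock error still returned after the next logout/login:

podman system reset --force   # WARNING: deletes all containers, images, and volumes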

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

JesterBee commented 1 year ago

This issue interests me also. I'm running into a comparable error.

rhatdan commented 12 months ago

@mheon what should we do with this one?

mheon commented 11 months ago

Moving the SHM file seems reasonable, but low-priority.

Adelcelevator commented 10 months ago

Hello, a solution I found is to use loginctl enable-linger [user owning the containers]; with this, the containers keep running when the user logs out of the SSH session. I don't know if this counts as a real solution, but it works.

pavinjosdev commented 8 months ago

@Adelcelevator This workaround does not help across a system reboot. Still, the issue is not too much of a problem, as the errors happen only once on a fresh boot.

pavin@lmde:~$ distrobox ls
ERRO[0000] Refreshing container 7d553e20b5af16e6252d3ed1d90e5b58ffa200cfc916d50fb695e6b9b1ee6122: acquiring lock 0 for container 7d553e20b5af16e6252d3ed1d90e5b58ffa200cfc916d50fb695e6b9b1ee6122: file exists 
ERRO[0000] Refreshing volume 8a7bd54137e2dd29e2cf6a5ec7a1d478e448dd43260490acffd1599b9c622141: acquiring lock 1 for volume 8a7bd54137e2dd29e2cf6a5ec7a1d478e448dd43260490acffd1599b9c622141: file exists 
ERRO[0000] Refreshing volume 976730261e76505f7652bb39439e6d874fa882c0aaba126341dfd41c2a15910a: acquiring lock 2 for volume 976730261e76505f7652bb39439e6d874fa882c0aaba126341dfd41c2a15910a: file exists 
ID           | NAME          | STATUS                    | IMAGE
6f6dee40f25a | deb           | Exited (143) 2 hours ago  | quay.io/toolbx-images/debian-toolbox:12
c4e379d3e153 | deb-unprocess | Exited (137) 2 days ago   | quay.io/toolbx-images/debian-toolbox:12
fd6944ece16c | deb-unipc     | Exited (143) 2 days ago   | quay.io/toolbx-images/debian-toolbox:12
e72fb6b8ff75 | deb-unnet     | Exited (143) 2 days ago   | quay.io/toolbx-images/debian-toolbox:12
395ae2277973 | deb-undevsys  | Exited (143) 2 days ago   | quay.io/toolbx-images/debian-toolbox:12
dba6d239a775 | deb-unall     | Exited (137) 31 hours ago | quay.io/toolbx-images/debian-toolbox:12
a8bbab296e02 | deb-init      | Exited (130) 2 hours ago  | quay.io/toolbx-images/debian-toolbox:12
7d553e20b5af | arch-init     | Exited (130) 2 hours ago  | quay.io/toolbx-images/archlinux-toolbox:latest

himanshugiripunje commented 5 months ago

I have a proven solution for this.

Version: podman version 4.2.0

How the issue happened

Details

systemd manages the boot process and system task management.

STEPS

$ cat /etc/systemd/system/podman.service

[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
Documentation=man:podman-system-service(1)
StartLimitIntervalSec=0

[Service]
Type=exec
KillMode=process
Environment=LOGGING="--log-level=info"
ExecStart=/usr/bin/podman $LOGGING system service tcp:0.0.0.0:8080 --time=0

Then reload systemd and enable the socket and service:

systemctl daemon-reload
systemctl enable podman.socket podman
systemctl start podman.socket podman
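
Assuming the unit above is running, the API can be checked via the Docker-compatible _ping route; note that binding to tcp:0.0.0.0 exposes the API on all interfaces without authentication:

curl -s http://127.0.0.1:8080/_ping && echo   # expect "OK"
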
igbins09 commented 4 months ago

WORKAROUND SOLUTION

This is what worked for me.

P.S. I think it only works for users running a rootless Podman container (i.e., a container run by a non-root user).

If you want the container to keep running even after the user has logged out, you need to enable lingering for that user. Rootless Podman containers are tied to the user session and are stopped when the user logs out unless lingering is enabled.

You can enable lingering for a user with the loginctl command. This allows the user's processes to continue running after the user has logged out.

Here's how you can do it:

  1. Run the following command:
sudo loginctl enable-linger username

Replace username with the name of the user for whom you want to enable lingering.

This command will create a file named after the user in the /var/lib/systemd/linger/ directory. This file signals to systemd that lingering is enabled for the user.

After you've enabled lingering, the user's processes will continue to run after the user has logged out, until they are explicitly stopped or the system is rebooted.

BUT IF YOU ARE RUNNING A ROOT PODMAN CONTAINER, YOU CAN USE A SYSTEMD SYSTEM SERVICE INSTEAD, SINCE IT DOES NOT DEPEND ON ANY USER SESSION.
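
For completeness, a hedged sketch of the systemd-unit approach using podman generate systemd (available in Podman 4.2; the container name pgadminLatest is taken from this issue):

# Rootless variant: install the generated unit as a systemd *user* service
# and enable lingering so it survives logout and reboot.
podman generate systemd --new --files --name pgadminLatest
mkdir -p ~/.config/systemd/user
mv container-pgadminLatest.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now container-pgadminLatest.service
sudo loginctl enable-linger $USER

# Root variant: install the same generated unit as a *system* service instead.
# sudo cp container-pgadminLatest.service /etc/systemd/system/
# sudo systemctl daemon-reload
# sudo systemctl enable --now container-pgadminLatest.service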