containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Unable to start containers after a forced shutdown #5986

Closed: lewo closed this issue 3 years ago

lewo commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

  1. Run the following container:

sudo /usr/bin/podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN

  2. Force a shutdown of the VM:

sudo virsh destroy

  3. Restart the VM and start the container again.

Describe the results you received: The container fails to start with the readlink error shown below ("no such file or directory"). This occurs in approximately 1 in 5 forced shutdowns.

sudo /usr/bin/podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN
Error: readlink /var/lib/containers/storage/overlay/l/QRPHWAOMUOP7RQXQKPUY4Y7I3Z: no such file or directory

sudo podman inspect localhost/atomix/atomix:3.1.5
Error: error parsing image data "57ddcf43f4ac8f399810d4b44ded2c3a63e5abfb672bc447c3aa0f18e39a282c": readlink /var/lib/containers/storage/overlay/l/GMVU2BJI2CBP6Z2DFDEHCCZGTD: no such file or directory

Describe the results you expected: Container starts correctly

Additional information you deem important (e.g. issue happens only occasionally): The only workaround seems to be to delete the image and re-pull it:

sudo podman rm -f atomix/atomix:3.1.5
sudo podman pull atomix/atomix:3.1.5

Output of podman version:

Version:            1.9.0
RemoteAPI Version:  1
Go Version:         go1.12.12
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  gitCommit: ""
  goVersion: go1.12.12
  podmanVersion: 1.9.0
host:
  arch: amd64
  buildahVersion: 1.14.8
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.15-2.3.el8.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.15, commit: ceb15924831eac767b6938880570e048ff787d0d'
  cpus: 2
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: journald
  hostname: tcn
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  kernel: 4.18.0-147.8.1.el8_1.x86_64
  memFree: 1927327744
  memTotal: 3964665856
  ociRuntime:
    name: runc
    package: runc-1.0.0-15.4.el8.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc10
      commit: c2df86ba3af1e210a0f9d745df96e4329e3e6808
      spec: 1.0.1-dev
  os: linux
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.0.0-4.2.el8.x86_64
    version: |-
      slirp4netns version 1.0.0
      commit: a3be729152a33e692cd28b52f664defbf2e7810a
      libslirp: 4.2.0
  swapFree: 4260356096
  swapTotal: 4260356096
  uptime: 24m 19.52s
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/tcnbuild/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/tcnbuild/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 0
  runRoot: /run/user/1001/containers
  volumePath: /home/tcnbuild/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.9.0-1.2.el8.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.): KVM CentOS 8.1 Guest VM running latest stable podman.

github-actions[bot] commented 4 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 4 years ago

This issue seems to have been lost. Sorry about that, are you still having issues with this?

lewo commented 4 years ago

Hi, yes, occasionally. It seems that either the IP config file can be left dangling or a reference to the image is left behind.

Leigh

mheon commented 4 years ago

I was under the impression that the symlink issue with images had already been resolved in c/storage, but that appears to be incorrect.

rhatdan commented 4 years ago

@nalind PTAL

nalind commented 4 years ago

When we get this error, I think that calling recreateSymlinks(), or a version of it that only cared about the link for the specific layer whose link we couldn't read, would work around this.
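
For context, a minimal sketch (not the actual c/storage code) of what such a targeted variant might look like, assuming the overlay driver's usual on-disk layout where each layer directory stores its short link name in a "link" file and overlay/l/<name> is a relative symlink to ../<layer-id>/diff; the function name and error handling are hypothetical:

package main

import (
    "fmt"
    "os"
    "path/filepath"
    "strings"
)

// recreateLayerLink re-creates overlay/l/<name> -> ../<layerID>/diff for a
// single layer whose short link went missing (e.g. after a forced shutdown).
// This is a hypothetical helper, not the actual c/storage implementation.
func recreateLayerLink(overlayRoot, layerID string) error {
    // Each layer directory records its short link name in a "link" file.
    data, err := os.ReadFile(filepath.Join(overlayRoot, layerID, "link"))
    if err != nil {
        return fmt.Errorf("reading link name for layer %s: %w", layerID, err)
    }
    name := strings.TrimSpace(string(data))

    linkPath := filepath.Join(overlayRoot, "l", name)
    if _, err := os.Lstat(linkPath); err == nil {
        return nil // link already exists, nothing to do
    } else if !os.IsNotExist(err) {
        return err
    }
    // Re-create the relative symlink that the overlay mount options refer to.
    return os.Symlink(filepath.Join("..", layerID, "diff"), linkPath)
}

func main() {
    // Example call; the storage path and layer ID are placeholders.
    if err := recreateLayerLink("/var/lib/containers/storage/overlay", "LAYER_ID"); err != nil {
        fmt.Println("recreate failed:", err)
    }
}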

mheon commented 4 years ago

Is there an easy way to determine if the error in question is an error that would require recreateSymlinks()? Alternatively, how bad is this, performance-wise - I could force it to run every time Podman detects a reboot...

nalind commented 4 years ago

It looks like the error that comes back from Readlink() in this case would cause os.IsNotExist() to return true.
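
As a sketch of how that check could look at the call site (hypothetical package and helper name; only the error test mirrors the observation above):

package storagecheck

import "os"

// linkMissing reports whether reading the layer's short symlink failed in the
// "file has vanished" way seen after a forced shutdown; this is the case a
// targeted recreateSymlinks() could repair. Sketch only, not a c/storage API.
func linkMissing(linkPath string) bool {
    _, err := os.Readlink(linkPath)
    return err != nil && os.IsNotExist(err)
}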

rhatdan commented 4 years ago

@lewo @nalind @mheon What should we do with this issue?

banool commented 3 years ago

I am seeing this issue too.

$ podman version
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.13.15
Built:        Fri Oct  2 07:30:39 2020
OS/Arch:      linux/amd64
$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.21-1.el8.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: fa5f92225c4c95759d10846106c1ebd325966f91-dirty'
  cpus: 2
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: journald
  hostname: littlesally
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-193.19.1.el8_2.x86_64
  linkmode: dynamic
  memFree: 1683558400
  memTotal: 3798777856
  ociRuntime:
    name: runc
    package: runc-1.0.0-65.rc10.module_el8.2.0+305+5e198a41.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-0.4.2-3.git21fdece.module_el8.2.0+305+5e198a41.x86_64
    version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  swapFree: 4102025216
  swapTotal: 4102025216
  uptime: 7h 38m 38.17s (Approximately 0.29 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/daniel/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-5.module_el8.2.0+305+5e198a41.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/daniel/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 12
  runRoot: /run/user/1000
  volumePath: /home/daniel/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1601649039
  BuiltTime: Fri Oct  2 07:30:39 2020
  GitCommit: ""
  GoVersion: go1.13.15
  OsArch: linux/amd64
  Version: 2.1.1
$ /usr/bin/podman run -a stdout -a stderr --cgroups no-conmon --conmon-pidfile /run/user/1000/team-heist-tactics.service-pid --cidfile /run/user/1000/team-heist-tactics.service-cid -v /var/www/team_heist_tactics_static:/bindmounted_static --publish 127.0.0.1:19996:19996 --name team-heist-tactics docker.pkg.github.com/banool/team_heist_tactics/team_heist_tactics:latest
Error: readlink /home/daniel/.local/share/containers/storage/overlay/l/YAUGQXTCBOZLL5DOMFTOX6KLBI: no such file or directory

The workaround for me was to delete the image:

podman image rm -f docker.pkg.github.com/banool/team_heist_tactics/team_heist_tactics:latest

rhatdan commented 3 years ago

Any idea how you got into this state? Do you have a reproducer?

yangm97 commented 3 years ago

@rhatdan This happened for me when I had an Orange Pi Zero, with its underpowered SD card, try to pull and launch around 7 containers simultaneously. Of course the little guy overheated and/or kernel panicked.

Whatever happened caused a forced shutdown during the pulling/creation/startup of containers, and there were probably containers in most of these stages, since 2 of the images were relatively small and were likely further along than the others.

So: high load, multiple concurrent operations, and an unclean shutdown. Maybe a high ext4 commit interval would make this issue reproduce more reliably.

BrysonMcI commented 3 years ago

I don't have a reliable reproduction @rhatdan, but I hit it frequently enough with containers running CouchDB during any shutdown/reboot. It seems to be more likely when the shutdown comes as a power-off of the VM.

rhatdan commented 3 years ago

@giuseppe Could this be fuse-overlay related, or is this just a partial removal from container storage that is causing this problem?

giuseppe commented 3 years ago

@rhatdan I don't think it is related to fuse-overlayfs. Generally, the storage can get corrupted on a forced shutdown, and the missing symlinks are just one symptom. What I am most worried about is that images could be corrupted as well (e.g. missing or incomplete files), and this is difficult to detect.

When running in a cluster, CRI-O wipes out the entire storage on the next node boot if the node wasn't stopped cleanly. I think this is still the safest thing we can do for now, until we have something like "podman storage fsck" that can verify that each file in the images is not corrupted and, if needed, re-pull the image.

rhatdan commented 3 years ago

How difficult would it be to reassemble the storage with an fsck option? The difference between CRI-O and Podman is that blowing away containers could mean losing a serious amount of work. Think toolbox containers.

yangm97 commented 3 years ago

What I am worried the most about is that images could be corrupted as well (e.g. missing or incomplete files) and this is difficult to detect

I can confirm this is a thing that happens. I've seen some applications crashing for no apparent reason, which turned out to be fixed by removing and re-pulling the image (same digest). But again, no reproducer.

Does podman support any sort of read-only rootfs setup? Like storing images in a partition which gets mounted as ro? Or even the whole rootfs mounted as ro.

giuseppe commented 3 years ago

How difficult would it be to reassemble the storage with an fsck option? The difference between CRI-O and Podman is that blowing away containers could mean losing a serious amount of work. Think toolbox containers.

We would need to checksum each file in the image. That would get us closer to the OSTree storage model; OSTree has an fsck operation that works this way.
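
As a rough illustration of what such a verification pass would involve, here is a sketch that assumes a (hypothetical) per-file SHA-256 digest map is available for each layer; c/storage does not currently record this, so the function is illustrative only:

package storagecheck

import (
    "crypto/sha256"
    "encoding/hex"
    "io"
    "io/fs"
    "os"
    "path/filepath"
)

// verifyLayer walks a layer's diff directory and returns the relative paths
// whose contents no longer match the expected SHA-256 digests. The expected
// map is hypothetical: per-file digests would first have to be recorded.
func verifyLayer(diffDir string, expected map[string]string) ([]string, error) {
    var corrupted []string
    err := filepath.WalkDir(diffDir, func(path string, d fs.DirEntry, walkErr error) error {
        if walkErr != nil || d.IsDir() {
            return walkErr
        }
        rel, err := filepath.Rel(diffDir, path)
        if err != nil {
            return err
        }
        want, ok := expected[rel]
        if !ok {
            return nil // no recorded digest for this entry
        }
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        h := sha256.New()
        if _, err := io.Copy(h, f); err != nil {
            return err
        }
        if hex.EncodeToString(h.Sum(nil)) != want {
            corrupted = append(corrupted, rel)
        }
        return nil
    })
    return corrupted, err
}

Every file in every layer has to be re-read and hashed, which is why running this on each boot would be expensive.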

Alternatively, though more expensive in terms of I/O, we could record the image as pulled only after we do a syncfs().
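
A sketch of that alternative, assuming the golang.org/x/sys/unix wrapper for syncfs(2); markPulled is a hypothetical stand-in for however the store would record a completed pull:

package storagecheck

import (
    "os"

    "golang.org/x/sys/unix"
)

// commitAfterSyncfs flushes the filesystem that holds the image store before
// recording a pull, so a power loss cannot leave an image marked complete but
// with truncated files. markPulled is a hypothetical callback, not a real API.
func commitAfterSyncfs(graphRoot string, markPulled func() error) error {
    dir, err := os.Open(graphRoot)
    if err != nil {
        return err
    }
    defer dir.Close()

    // syncfs(2) writes back all dirty data on the filesystem containing fd.
    if err := unix.Syncfs(int(dir.Fd())); err != nil {
        return err
    }
    return markPulled()
}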

Does podman support any sort of read-only rootfs setup? Like storing images in a partition which gets mounted as ro? Or even the whole rootfs mounted as ro.

You can use an additional store that works exactly as you described: the entire storage lives on a read-only partition, and you tell Podman to use it with:

additionalimagestores = [
     "/path/to/the/storage"
]

in the storage.conf file

rhatdan commented 3 years ago

https://www.redhat.com/sysadmin/image-stores-podman

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

@nalind Any movement on this?

nalind commented 3 years ago

Sorry, been focused on other bugs.

PavelSosin-320 commented 3 years ago

I just faced this issue in a very strange situation.

Background: I run Podman 2.1.1 on an Ubuntu 20.04 WSL distro.

Steps to reproduce:

  1. sudo -i -> switch to root
  2. Create a pod containing a single nginx container
  3. Create a pod containing a VSCode dev-container and a node container as a side-car
  4. podman image inspect nginx -> everything is OK
  5. podman image inspect node -> Error: readlink /var/lib/containers/storage/overlay: invalid argument
  6. podman pull node
  7. podman image inspect node -> the same error
  8. buildah pull node
  9. buildah inspect node -> wonderful! Everything is OK! This is obviously an issue with how the image is stored.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

This problem still exists.

PavelSosin-320 commented 3 years ago

A similar issue exists after a simple Linux sudo reboot: neither locks nor ports are released. However, a proper systemctl reboot works OK. It looks like a systemd issue, not a Podman one. The entire discussion about power off is here: Shutdown after power off/on

LordPraslea commented 3 years ago

I too can confirm that this issue has just recently occurred and is still happening. It's kind of nasty. What makes this bad is the fact that the container is being started by systemd. People will scratch their heads for a LONG TIME before finding a solution.

There was a power surge in the area over the weekend which resulted in a short loss of electricity. I had recently moved the Raspberry Pi I use for testing and it had no UPS. Only one container seems to have been affected, a rootful haproxy. The other, rootless containers do not seem to have malfunctioned.

The solution was indeed to pull haproxy again. Not really something I'd want to do in production considering that haproxy and podman are both key components.

$ sudo podman info --debug
host:
  arch: arm
  buildahVersion: 1.16.1
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.20, commit: '
  cpus: 4
  distribution:
    distribution: raspbian
    version: "10"
  eventLogger: journald
  hostname: raspberrypi
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.4.72-v7l+
  linkmode: dynamic
  memFree: 3287650304
  memTotal: 4013862912
  ociRuntime:
    name: runc
    package: 'runc: /usr/sbin/runc'
    path: /usr/sbin/runc
    version: |-
      runc version 1.0.0~rc6+dfsg1
      commit: 1.0.0~rc6+dfsg1-3
      spec: 1.0.1
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  rootless: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 104853504
  swapTotal: 104853504
  uptime: 1h 15m 47.34s (Approximately 0.04 days)
registries:
  search:

umohnani8 commented 3 years ago

Issue fixed in c/storage by https://github.com/containers/storage/pull/822

rhatdan commented 3 years ago

@umohnani8 Please backport containers/storage#822 to v1.26 so we can update the vendor in podman to fix this in podman 3.0.

umohnani8 commented 3 years ago

@rhatdan the patch is already in c/storage v1.26

rhatdan commented 3 years ago

Did you open a PR to vendor in an update?

vrothberg commented 3 years ago

I don't think that'll pass CI due to the libcap farts, see https://github.com/containers/podman/pull/9462

w4tsn commented 3 years ago

I've also hit this on Podman 2.2.1 on Fedora IoT. I want to emphasize that this is really problematic in the low-bandwidth IoT use cases with unstable power supplies that I'm facing right now: when operating Podman on devices at sites with slow connection speeds, data caps (around 500 MB to 1 GB per month), or per-KB/MB billing, a corrupted storage and re-downloading a 500 MB Node.js image is either impossible or at least lethal on site.

For this it's crucial to store all images on update in a dedicated, read-only storage like giuseppe suggested earlier.

Just mentioning it because it caused a lot of headache already in the past and maybe others hitting this with a similar use-case can benefit from the idea.

As a side note on OSTree-based systems: I think it's a viable approach in those scenarios to commit the images into the OSTree, which also allows diff-based container updates, since OSTree comes with the ability to do delta updates. It also ties the container image state/version to the OS version, which is a nice property IMO.

umohnani8 commented 3 years ago

The storage fix made it into podman 3.1.0-rc1. @rhatdan do we plan to backport this fix for 2.2.1 as well?

rhatdan commented 3 years ago

no.

umohnani8 commented 3 years ago

Okay, this is fixed in podman 3.1.0-rc1 then, closing the issue now.

buck2202 commented 3 years ago

Just to clarify, I'm assuming that the ro-store workaround would not be sufficient for containers run as root, since it relies on filesystem permissions, right?

I'm getting hit with this fairly often using preemptible instances on Google Cloud. Since I have to expect random hard shutdowns, I'm already taking container checkpoints at regular intervals (which require root). My fairly overkill workaround to the random corruption is: if, after boot, any podman container inspect or podman image inspect returns a nonzero exit code, I dump a list of containers, run podman system reset, re-pull my images, and restore my container list from whatever checkpoints happen to be present.

My scripts seem to catch the corruption and allow recovery, but it's fairly aggressive.

yangm97 commented 3 years ago

@w4tsn and @buck2202, for an immediate workaround you could set up a read-only image store and make Podman stateless as described here.