containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Listing containers sometimes fails because "container X does not exist in database" #10225

Closed sshnaidm closed 3 years ago

sshnaidm commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

The error appears intermittently when listing containers. It happens in OpenStack TripleO CI jobs and has become more frequent recently.

Error: container 994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd does not exist in database: no such container

Steps to reproduce the issue:

  1. podman container ls -q -a

The command then fails with the error above. The journal log shows that the listing command ran between the libpod and libpod-conmon log entries for that container. Could the problem be that the container is not fully registered yet but already appears in the listing?

May 04 21:00:24 standalone.localdomain systemd[1]: Started libpod-conmon-994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd.scope.
May 04 21:00:24 standalone.localdomain systemd[1]: Started libcontainer container 994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd.

// ====== here the listing command started - podman ls -q -a
May 04 21:00:25 standalone.localdomain ansible-podman_container_info[261180]: Invoked with executable=podman name=None

May 04 21:00:25 standalone.localdomain systemd[1]: libpod-994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd.scope: Succeeded.
May 04 21:00:25 standalone.localdomain systemd[1]: libpod-994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd.scope: Consumed 672ms CPU time
May 04 21:00:27 standalone.localdomain systemd[1]: libpod-conmon-994e9ba4224c10df8bfbb492092aa407f284bea2a995ddf2cf85d29218eaf7bd.scope: Succeeded.

Describe the results you expected: The container listing command should list only fully registered containers, not partially registered ones. It should never fail under normal circumstances.
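
For illustration only, here is a toy Python model of the kind of race suspected above: a lister snapshots the container IDs and then looks each one up, while another process removes a container in between. `ToyState`, `list_strict`, and `list_tolerant` are hypothetical names made up for this sketch. This is not podman's code, and nothing in this report confirms where in podman the race actually is.

```python
class NoSuchContainer(Exception):
    """Raised when an ID is no longer present in the toy state."""


class ToyState:
    """In-memory stand-in for a container database (hypothetical, not podman's)."""

    def __init__(self, ids):
        self._ctrs = {cid: {"id": cid, "status": "exited"} for cid in ids}

    def all_ids(self):
        # Step 1 of a listing: snapshot the known IDs.
        return list(self._ctrs)

    def lookup(self, cid):
        # Step 2 of a listing: fetch the details of one ID.
        try:
            return self._ctrs[cid]
        except KeyError:
            raise NoSuchContainer(
                f"container {cid} does not exist in database"
            ) from None

    def remove(self, cid):
        self._ctrs.pop(cid, None)


def list_strict(state, between=lambda: None):
    """Fail the whole listing if any ID vanished after the snapshot."""
    ids = state.all_ids()
    between()  # simulates another process acting between the two steps
    return [state.lookup(cid) for cid in ids]


def list_tolerant(state, between=lambda: None):
    """Skip IDs that vanished after the snapshot instead of failing."""
    ids = state.all_ids()
    between()
    found = []
    for cid in ids:
        try:
            found.append(state.lookup(cid))
        except NoSuchContainer:
            continue  # removed concurrently; leave it out of the listing
    return found
```

In this model, `list_strict` raises as soon as one ID disappears between the two steps, while `list_tolerant` returns the remaining containers, which is the behavior asked for above.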

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:      3.0.2-dev
API Version:  3.0.0
Go Version:   go1.16.1
Built:        Fri Mar 26 20:39:59 2021
OS/Arch:      linux/amd64

Output of podman info --debug (run without --debug, unfortunately):

host:
  arch: amd64
  buildahVersion: 1.19.8
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.26-1.module_el8.5.0+736+58cc1a5a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 4fbca32554ff2f8218d059dc2f0a68ed16f32092'
  cpus: 8
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: file
  hostname: standalone.localdomain
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-301.1.el8.x86_64
  linkmode: dynamic
  memFree: 239710208
  memTotal: 8147353600
  ociRuntime:
    name: runc
    package: runc-1.0.0-70.rc92.module_el8.5.0+736+58cc1a5a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.2-dev'
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 8243326976
  swapTotal: 8589930496
  uptime: 1h 34m 6.9s (Approximately 0.04 days)
registries:
  127.0.0.1:5001:
    Blocked: false
    Insecure: true
    Location: 127.0.0.1:5001
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: 127.0.0.1:5001
  158.69.67.133:5001:
    Blocked: false
    Insecure: true
    Location: 158.69.67.133:5001
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: 158.69.67.133:5001
  192.168.24.1:8787:
    Blocked: false
    Insecure: true
    Location: 192.168.24.1:8787
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: 192.168.24.1:8787
  search:
  - registry.redhat.io
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 123
    paused: 0
    running: 54
    stopped: 69
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 41
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1616791199
  BuiltTime: Fri Mar 26 20:39:59 2021
  GitCommit: ""
  GoVersion: go1.16.1
  OsArch: linux/amd64
  Version: 3.0.2-dev

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.0.1-6.module_el8.5.0+736+58cc1a5a.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

No

Additional environment details (AWS, VirtualBox, physical, etc.):

This error has been happening more and more often in OpenStack recently.

rhatdan commented 3 years ago

I believe I have seen a fix for this in the current podman.

mheon commented 3 years ago

This one was originally fixed in the 1.x days, but I presume the rewrite for the HTTP API in 2.x caused it to reemerge. Don't know if we've merged anything since 3.0 that would improve it.

rhatdan commented 3 years ago

Well, this example is not using the API. I found a race in external containers, but I don't see one in non-external ones.

mwhahaha commented 3 years ago

I've hit this running podman ps -a while containers are being added/removed by a different process using the CLI (no API calls).

[stack@standalone-0 standalone-ansible-407lq66n]$ sudo podman ps -a | grep standalone
e7966efa68d7  quay.io/tripleomaster/openstack-horizon:current-tripleo                                           44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-horizon
4c9562041c20  quay.io/tripleomaster/openstack-iscsid:current-tripleo                                            44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-iscsid
6adaf78e9c87  quay.io/tripleomaster/openstack-keystone:current-tripleo                                          44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-keystone
04d5d76122ea  quay.io/tripleomaster/openstack-memcached:current-tripleo                                         44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-memcached
7ce1d2a92c65  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-mysql
84f18d00576e  quay.io/tripleomaster/openstack-neutron-server:current-tripleo                                    44 minutes ago  Exited (0) 44 minutes ago          standalone-0-container-puppet-neutron
86d2c7dcd128  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          44 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-nova
c755ad238b05  quay.io/tripleomaster/openstack-nova-compute:current-tripleo                                      44 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-nova_libvirt
3cc7d12972ad  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          44 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-nova_metadata
f9a6b409ca06  quay.io/tripleomaster/openstack-ovn-controller:current-tripleo                                    44 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-ovn_controller
c427fc827b08  quay.io/tripleomaster/openstack-rabbitmq:current-tripleo                                          43 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-rabbitmq
1526c26f7c1d  quay.io/tripleomaster/openstack-placement-api:current-tripleo                                     43 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-placement
882f5dbc9f9c  quay.io/tripleomaster/openstack-redis:current-tripleo                                             43 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-redis
9ac7d0cd25d5  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                43 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-swift
72888b417a44  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                43 minutes ago  Exited (0) 43 minutes ago          standalone-0-container-puppet-swift_ringbuilder
e0677af1397d  quay.io/tripleomaster/openstack-cron:current-tripleo                                              14 seconds ago  Exited (0) 7 seconds ago           standalone-0-container-puppet-crond
40ca61b64d58  quay.io/tripleomaster/openstack-glance-api:current-tripleo                                        14 seconds ago  Up 15 seconds ago                  standalone-0-container-puppet-glance_api
e5443cfa8a59  quay.io/tripleomaster/openstack-cinder-api:current-tripleo                                        14 seconds ago  Up 14 seconds ago                  standalone-0-container-puppet-cinder
f2d8b00d7ca3  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           14 seconds ago  Exited (0) 2 seconds ago           standalone-0-container-puppet-clustercheck
9ba290f2b098  quay.io/tripleomaster/openstack-haproxy:current-tripleo                                           4 seconds ago   Up 3 seconds ago                   standalone-0-container-puppet-haproxy
[stack@standalone-0 standalone-ansible-407lq66n]$ sudo podman ps -a | grep standalone
Error: container e7966efa68d751b59469c6b1a00cd7f5474fd865e061e2e4e6e24b2e6b42dc65 does not exist in database: no such container
[stack@standalone-0 standalone-ansible-407lq66n]$ sudo podman ps -a | grep standalone
72888b417a44  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                44 minutes ago      Exited (0) 44 minutes ago              standalone-0-container-puppet-swift_ringbuilder
e0677af1397d  quay.io/tripleomaster/openstack-cron:current-tripleo                                              About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-crond
40ca61b64d58  quay.io/tripleomaster/openstack-glance-api:current-tripleo                                        About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-glance_api
e5443cfa8a59  quay.io/tripleomaster/openstack-cinder-api:current-tripleo                                        About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-cinder
f2d8b00d7ca3  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-clustercheck
9ba290f2b098  quay.io/tripleomaster/openstack-haproxy:current-tripleo                                           About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-haproxy
853e20db4e2f  quay.io/tripleomaster/openstack-horizon:current-tripleo                                           About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-horizon
ae7408b91290  quay.io/tripleomaster/openstack-iscsid:current-tripleo                                            About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-iscsid
2241d3e6784a  quay.io/tripleomaster/openstack-keystone:current-tripleo                                          About a minute ago  Exited (0) 48 seconds ago              standalone-0-container-puppet-keystone
99d2bad91d5b  quay.io/tripleomaster/openstack-memcached:current-tripleo                                         About a minute ago  Exited (0) 58 seconds ago              standalone-0-container-puppet-memcached
5b26c3b3f7e9  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           59 seconds ago      Exited (0) 44 seconds ago              standalone-0-container-puppet-mysql
ad8e9990356f  quay.io/tripleomaster/openstack-neutron-server:current-tripleo                                    58 seconds ago      Exited (0) 34 seconds ago              standalone-0-container-puppet-neutron
8efc10786ccb  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          54 seconds ago      Exited (0) 26 seconds ago              standalone-0-container-puppet-nova
8df40b4daad2  quay.io/tripleomaster/openstack-nova-compute:current-tripleo                                      45 seconds ago      Exited (0) 15 seconds ago              standalone-0-container-puppet-nova_libvirt
2f857384ce87  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          40 seconds ago      Exited (0) 12 seconds ago              standalone-0-container-puppet-nova_metadata
93e3c164ee5d  quay.io/tripleomaster/openstack-ovn-controller:current-tripleo                                    31 seconds ago      Exited (0) 19 seconds ago              standalone-0-container-puppet-ovn_controller
a91170487a31  quay.io/tripleomaster/openstack-rabbitmq:current-tripleo                                          20 seconds ago      Up 19 seconds ago                      standalone-0-container-puppet-rabbitmq
5fb0b74f05af  quay.io/tripleomaster/openstack-placement-api:current-tripleo                                     14 seconds ago      Up 14 seconds ago                      standalone-0-container-puppet-placement
90efaec58efd  quay.io/tripleomaster/openstack-redis:current-tripleo                                             11 seconds ago      Up 11 seconds ago                      standalone-0-container-puppet-redis
652e083e530c  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                9 seconds ago       Up 9 seconds ago                       standalone-0-container-puppet-swift
[stack@standalone-0 standalone-ansible-407lq66n]$ sudo podman ps -a | grep standalone
e0677af1397d  quay.io/tripleomaster/openstack-cron:current-tripleo                                              2 minutes ago       Exited (0) 2 minutes ago               standalone-0-container-puppet-crond
40ca61b64d58  quay.io/tripleomaster/openstack-glance-api:current-tripleo                                        2 minutes ago       Exited (0) About a minute ago          standalone-0-container-puppet-glance_api
e5443cfa8a59  quay.io/tripleomaster/openstack-cinder-api:current-tripleo                                        2 minutes ago       Exited (0) About a minute ago          standalone-0-container-puppet-cinder
f2d8b00d7ca3  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           2 minutes ago       Exited (0) 2 minutes ago               standalone-0-container-puppet-clustercheck
9ba290f2b098  quay.io/tripleomaster/openstack-haproxy:current-tripleo                                           2 minutes ago       Exited (0) About a minute ago          standalone-0-container-puppet-haproxy
853e20db4e2f  quay.io/tripleomaster/openstack-horizon:current-tripleo                                           2 minutes ago       Exited (0) About a minute ago          standalone-0-container-puppet-horizon
ae7408b91290  quay.io/tripleomaster/openstack-iscsid:current-tripleo                                            About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-iscsid
2241d3e6784a  quay.io/tripleomaster/openstack-keystone:current-tripleo                                          About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-keystone
99d2bad91d5b  quay.io/tripleomaster/openstack-memcached:current-tripleo                                         About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-memcached
5b26c3b3f7e9  quay.io/tripleomaster/openstack-mariadb:current-tripleo                                           About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-mysql
ad8e9990356f  quay.io/tripleomaster/openstack-neutron-server:current-tripleo                                    About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-neutron
8efc10786ccb  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-nova
8df40b4daad2  quay.io/tripleomaster/openstack-nova-compute:current-tripleo                                      About a minute ago  Exited (0) 57 seconds ago              standalone-0-container-puppet-nova_libvirt
2f857384ce87  quay.io/tripleomaster/openstack-nova-api:current-tripleo                                          About a minute ago  Exited (0) 54 seconds ago              standalone-0-container-puppet-nova_metadata
93e3c164ee5d  quay.io/tripleomaster/openstack-ovn-controller:current-tripleo                                    About a minute ago  Exited (0) About a minute ago          standalone-0-container-puppet-ovn_controller
a91170487a31  quay.io/tripleomaster/openstack-rabbitmq:current-tripleo                                          About a minute ago  Exited (0) 42 seconds ago              standalone-0-container-puppet-rabbitmq
5fb0b74f05af  quay.io/tripleomaster/openstack-placement-api:current-tripleo                                     56 seconds ago      Exited (0) 38 seconds ago              standalone-0-container-puppet-placement
90efaec58efd  quay.io/tripleomaster/openstack-redis:current-tripleo                                             54 seconds ago      Exited (0) 42 seconds ago              standalone-0-container-puppet-redis
652e083e530c  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                51 seconds ago      Exited (0) 37 seconds ago              standalone-0-container-puppet-swift
3365c86d7cf3  quay.io/tripleomaster/openstack-swift-proxy-server:current-tripleo                                40 seconds ago      Exited (0) 20 seconds ago              standalone-0-container-puppet-swift_ringbuilder
[stack@standalone-0 standalone-ansible-407lq66n]$ podman --version
podman version 3.0.2-dev
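
Until a fix is confirmed, a caller could work around the failure by retrying the listing. Below is a minimal Python sketch, assuming the failure is transient and recognizable by the "no such container" / "no such pod" text seen in this thread. `run_podman`, `list_with_retry`, and the retry count and delay are hypothetical choices for this sketch, not anything podman provides.

```python
import subprocess
import time

# Substrings of the error messages quoted in this thread.
TRANSIENT_MARKERS = ("no such container", "no such pod")


def run_podman(args):
    """Run podman with the given arguments; return (returncode, stdout, stderr)."""
    proc = subprocess.run(["podman", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def list_with_retry(args=("ps", "-a"), attempts=5, delay=0.2, run=run_podman):
    """Return the listing output, retrying only on the transient errors above."""
    last_err = ""
    for attempt in range(attempts):
        code, out, err = run(list(args))
        if code == 0:
            return out
        last_err = err
        if not any(marker in err for marker in TRANSIENT_MARKERS):
            break  # unrelated failure: retrying would not help
        if attempt + 1 < attempts:
            time.sleep(delay)
    raise RuntimeError(f"podman {' '.join(args)} failed: {last_err.strip()}")
```

The `run` parameter exists only so the retry logic can be exercised without podman installed. Retrying hides the symptom for the caller but does nothing about the underlying race.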

sshnaidm commented 3 years ago

It's very easy to reproduce now. Just create a bunch of containers and run podman container rm .. in a bash loop. In another terminal, run podman ps repeatedly. You can see:

DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /home/sshnaidm/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Overriding run root "/run/user/1000" with "/run/user/1000/containers" from database 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/sshnaidm/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /home/sshnaidm/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /home/sshnaidm/.local/share/containers/storage/volumes 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] using runtime "/usr/bin/crun"                
DEBU[0000] using runtime "/usr/bin/runc"                
INFO[0000] Error initializing configured OCI runtime kata: no valid executable found for OCI runtime kata: invalid argument 
INFO[0000] Setting parallel job count to 37             
Error: pod 165d470fcde61c693529019720041a350074a0d7acc8c45246aeabb03f077049 not found in database: no such pod
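
The reproduction steps above can be sketched as a single Python script instead of a bash loop plus a second terminal. Assumptions: the image name is a placeholder for any small image that contains `true`, and `podman`, `churn`, `poll`, and `reproduce` are hypothetical helper names. The only podman commands used are `podman create`, `podman container rm`, and `podman ps -a -q`.

```python
import subprocess
import threading

IMAGE = "docker.io/library/alpine"  # assumption: any small image should do


def podman(*args):
    """Run one podman command; return (returncode, stdout, stderr), stripped."""
    proc = subprocess.run(["podman", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


def churn(count, run=podman):
    """Create `count` containers, then remove them one by one."""
    ids = []
    for _ in range(count):
        code, out, _err = run("create", IMAGE, "true")
        if code == 0:
            ids.append(out)  # podman create prints the new container ID
    for cid in ids:
        run("container", "rm", cid)
    return ids


def poll(stop, run=podman):
    """List containers until `stop` is set; collect the error messages seen."""
    errors = []
    while not stop.is_set():
        code, _out, err = run("ps", "-a", "-q")
        if code != 0:
            errors.append(err)
    return errors


def reproduce(count=50, run=podman):
    """Poll the listing in a background thread while containers are churned."""
    stop = threading.Event()
    errors = []
    poller = threading.Thread(target=lambda: errors.extend(poll(stop, run)))
    poller.start()
    try:
        churn(count, run)
    finally:
        stop.set()
        poller.join()
    return errors
```

Calling `reproduce()` and printing the returned list would show any listing errors that occurred during the run. Since this is a race, an empty list from one run does not prove the problem is gone.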

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

mheon commented 3 years ago

Did we fix this one? I swear I saw a commit related to it, but I don't see it in the release notes for 3.2.0.

rhatdan commented 3 years ago

Ok let's assume this is fixed. Reopen if we are mistaken.