containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

nova_libvirt: podman panic issue #12717

Closed: vikram077 closed this issue 2 years ago

vikram077 commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

  1. Create a VM in RHOSP 16.2.
  2. Log in to the compute node on which the VM was launched.
  3. Stop and start the nova_libvirt container.
  4. The issue arises after repeating step 3 three or four times.

NOTE: The issue occurs only when a VM is in the Active state on the compute node; otherwise there is no issue. It is also reproducible after updating podman. A scripted version of the stop/start loop is sketched below.
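
For reference, step 3 can be scripted as a simple loop. This is a minimal reproduction sketch; the container name comes from this report and the iteration count is arbitrary:

# cycle the container a few times (assumes a VM is active on the node)
for i in 1 2 3 4; do
    podman stop nova_libvirt
    podman start nova_libvirt
done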

Describe the results you received:

Crash logs

[root@overcloud-computesriov-1 ~]# podman ps -a|grep libvirt
937cc4f9cb99  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 2 hours ago                  nova_virtlogd
ee92812556b6  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 2 hours ago                  nova_libvirt
9d9488b51407  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                /bin/bash -c /usr...  10 days ago  Exited (0) 10 days ago          nova_libvirt_init_secret
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman stop nova_libvirt
ERRO[0000] Failed to remove paths: map[hugetlb:/sys/fs/cgroup/hugetlb/machine.slice/libpod-ee92812556b615b7624bc371b5cf6484a71fc79a32e0d823738593aacdfa37bb.scope name=systemd:/sys/fs/cgroup/systemd/machine.slice/libpod-ee92812556b615b7624bc371b5cf6484a71fc79a32e0d823738593aacdfa37bb.scope pids:/sys/fs/cgroup/pids/machine.slice/libpod-ee92812556b615b7624bc371b5cf6484a71fc79a32e0d823738593aacdfa37bb.scope]
ee92812556b615b7624bc371b5cf6484a71fc79a32e0d823738593aacdfa37bb
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman ps -a|grep libvirt
937cc4f9cb99  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 2 hours ago                  nova_virtlogd
ee92812556b6  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 5 seconds ago                nova_libvirt
9d9488b51407  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                /bin/bash -c /usr...  10 days ago  Exited (0) 10 days ago          nova_libvirt_init_secret
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman ps -a|grep libvirt
937cc4f9cb99  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 2 hours ago                  nova_virtlogd
ee92812556b6  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago  Up 11 seconds ago               nova_libvirt
9d9488b51407  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                /bin/bash -c /usr...  10 days ago  Exited (0) 10 days ago          nova_libvirt_init_secret
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman stop nova_libvirt
ee92812556b615b7624bc371b5cf6484a71fc79a32e0d823738593aacdfa37bb
panic: operation not permitted

goroutine 88 [running]:
panic(0x560861500ba0, 0xc00029d9e0)
        /usr/lib/golang/src/runtime/panic.go:1064 +0x545 fp=0xc0001a5d78 sp=0xc0001a5cb0 pc=0x56085fc7eda5
github.com/containers/podman/libpod/lock.(*SHMLock).Unlock(0xc0003cc690)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/lock/shm_lock_manager_linux.go:121 +0x8f fp=0xc0001a5da8 sp=0xc0001a5d78 pc=0x56086088310f
github.com/containers/podman/libpod.(*Container).StopWithTimeout(0xc00061a780, 0xa, 0x5608617d1fe0, 0xc00026ffe0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/container_api.go:200 +0x3f8 fp=0xc0001a5e58 sp=0xc0001a5da8 pc=0x560860bcf8f8
github.com/containers/podman/libpod.(*Container).Stop(...)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/container_api.go:176
github.com/containers/podman/pkg/domain/infra/abi.(*ContainerEngine).ContainerStop.func1(0xc00061a780, 0x0, 0x0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/pkg/domain/infra/abi/containers.go:150 +0x511 fp=0xc0001a5ef8 sp=0xc0001a5e58 pc=0x560860dc09f1
github.com/containers/podman/pkg/parallel/ctr.ContainerOp.func1(0xc000146780, 0x56086180e9a0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/pkg/parallel/ctr/ctr.go:28 +0x30 fp=0xc0001a5f20 sp=0xc0001a5ef8 pc=0x560860d23c70
github.com/containers/podman/pkg/parallel.Enqueue.func1(0xc0000d23c0, 0x56086180e9a0, 0xc000130020, 0xc0000311e0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/pkg/parallel/parallel.go:66 +0x198 fp=0xc0001a5fc0 sp=0xc0001a5f20 pc=0x560860bae198
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0001a5fc8 sp=0xc0001a5fc0 pc=0x56085fcb66c1
created by github.com/containers/podman/pkg/parallel.Enqueue
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/pkg/parallel/parallel.go:55 +0x78

goroutine 1 [runnable]:
github.com/containers/podman/cmd/podman/registry.Context(0x5608622ccb60, 0x560860f94cba)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/cmd/podman/registry/registry.go:84 +0xfb fp=0xc00088bc78 sp=0xc00088bc70 pc=0x560860e1725b
main.persistentPostRunE(0x5608622ccb60, 0xc0004b4560, 0x1, 0x1, 0x0, 0x0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/cmd/podman/root.go:247 +0x1c5 fp=0xc00088bd28 sp=0xc00088bc78 pc=0x560860f862a5
github.com/containers/podman/vendor/github.com/spf13/cobra.(*Command).execute(0x5608622ccb60, 0xc00011a170, 0x1, 0x1, 0x5608622ccb60, 0xc00011a170)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/spf13/cobra/command.go:865 +0x382 fp=0xc00088be00 sp=0xc00088bd28 pc=0x5608602cc7a2
github.com/containers/podman/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x5608622deec0, 0xc000130020, 0x560861589fe0, 0x56086238e520)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/spf13/cobra/command.go:958 +0x375 fp=0xc00088bed8 sp=0xc00088be00 pc=0x5608602cd415
github.com/containers/podman/vendor/github.com/spf13/cobra.(*Command).Execute(...)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/spf13/cobra/command.go:895
github.com/containers/podman/vendor/github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/spf13/cobra/command.go:888
main.Execute()
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/cmd/podman/root.go:92 +0xee fp=0xc00088bf48 sp=0xc00088bed8 pc=0x560860f84fee
main.main()
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/cmd/podman/main.go:36 +0x94 fp=0xc00088bf88 sp=0xc00088bf48 pc=0x560860f84874
runtime.main()
        /usr/lib/golang/src/runtime/proc.go:204 +0x209 fp=0xc00088bfe0 sp=0xc00088bf88 pc=0x56085fc81a89
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00088bfe8 sp=0xc00088bfe0 pc=0x56085fcb66c1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x5608617a3550, 0x560862356550, 0x1411, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000078fb0 sp=0xc000078f90 pc=0x56085fc81e86
runtime.goparkunlock(...)
        /usr/lib/golang/src/runtime/proc.go:312
runtime.forcegchelper()
        /usr/lib/golang/src/runtime/proc.go:255 +0xc5 fp=0xc000078fe0 sp=0xc000078fb0 pc=0x56085fc81d25
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000078fe8 sp=0xc000078fe0 pc=0x56085fcb66c1
created by runtime.init.7
        /usr/lib/golang/src/runtime/proc.go:243 +0x37

goroutine 3 [GC sweep wait]:
runtime.gopark(0x5608617a3550, 0x560862356fc0, 0x140c, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc0000797a8 sp=0xc000079788 pc=0x56085fc81e86
runtime.goparkunlock(...)
        /usr/lib/golang/src/runtime/proc.go:312
runtime.bgsweep(0xc0000a0000)
        /usr/lib/golang/src/runtime/mgcsweep.go:182 +0x145 fp=0xc0000797d8 sp=0xc0000797a8 pc=0x56085fc6d2c5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000797e0 sp=0xc0000797d8 pc=0x56085fcb66c1
created by runtime.gcenable
        /usr/lib/golang/src/runtime/mgc.go:217 +0x5e

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x5608617a3550, 0x560862357a60, 0x140d, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000079f78 sp=0xc000079f58 pc=0x56085fc81e86
runtime.goparkunlock(...)
        /usr/lib/golang/src/runtime/proc.go:312
runtime.bgscavenge(0xc0000a0000)
        /usr/lib/golang/src/runtime/mgcscavenge.go:314 +0x2a5 fp=0xc000079fd8 sp=0xc000079f78 pc=0x56085fc6b425
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000079fe0 sp=0xc000079fd8 pc=0x56085fcb66c1
created by runtime.gcenable
        /usr/lib/golang/src/runtime/mgc.go:218 +0x85

goroutine 18 [finalizer wait]:
runtime.gopark(0x5608617a3550, 0x56086238e438, 0xc000791410, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000078758 sp=0xc000078738 pc=0x56085fc81e86
runtime.goparkunlock(...)
        /usr/lib/golang/src/runtime/proc.go:312
runtime.runfinq()
        /usr/lib/golang/src/runtime/mfinal.go:175 +0xab fp=0xc0000787e0 sp=0xc000078758 pc=0x56085fc61eab
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000787e8 sp=0xc0000787e0 pc=0x56085fcb66c1
created by runtime.createfing
        /usr/lib/golang/src/runtime/mfinal.go:156 +0x66

goroutine 19 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0000420b0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000074760 sp=0xc000074740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00004c000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0000747d8 sp=0xc000074760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000747e0 sp=0xc0000747d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 5 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500000, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00007a760 sp=0xc00007a740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00004e800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc00007a7d8 sp=0xc00007a760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00007a7e0 sp=0xc00007a7d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 20 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500010, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000074f60 sp=0xc000074f40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000051000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc000074fd8 sp=0xc000074f60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000074fe0 sp=0xc000074fd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 6 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500020, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00007af60 sp=0xc00007af40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000053800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc00007afd8 sp=0xc00007af60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00007afe0 sp=0xc00007afd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 21 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500030, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000075760 sp=0xc000075740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000056000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0000757d8 sp=0xc000075760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000757e0 sp=0xc0000757d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 7 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500040, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00007b760 sp=0xc00007b740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000058800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc00007b7d8 sp=0xc00007b760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00007b7e0 sp=0xc00007b7d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 22 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500050, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000075f60 sp=0xc000075f40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00005b000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc000075fd8 sp=0xc000075f60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000075fe0 sp=0xc000075fd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 8 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500060, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00007bf60 sp=0xc00007bf40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00005d800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc00007bfd8 sp=0xc00007bf60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00007bfe0 sp=0xc00007bfd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 34 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0004b26f0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000508760 sp=0xc000508740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000060000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0005087d8 sp=0xc000508760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0005087e0 sp=0xc0005087d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 23 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500070, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000076760 sp=0xc000076740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000062800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0000767d8 sp=0xc000076760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000767e0 sp=0xc0000767d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 9 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500080, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000504760 sp=0xc000504740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000065000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0005047d8 sp=0xc000504760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0005047e0 sp=0xc0005047d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 24 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc000500090, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000076f60 sp=0xc000076f40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000067800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc000076fd8 sp=0xc000076f60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000076fe0 sp=0xc000076fd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 10 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0005000a0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000504f60 sp=0xc000504f40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00006a000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc000504fd8 sp=0xc000504f60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000504fe0 sp=0xc000504fd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 25 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0005000b0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000077760 sp=0xc000077740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00006c800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0000777d8 sp=0xc000077760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000777e0 sp=0xc0000777d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 11 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0005000c0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000505760 sp=0xc000505740 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc00006f000)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc0005057d8 sp=0xc000505760 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0005057e0 sp=0xc0005057d8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 26 [GC worker (idle)]:
runtime.gopark(0x5608617a33d8, 0xc0005000d0, 0x1418, 0x0)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000077f60 sp=0xc000077f40 pc=0x56085fc81e86
runtime.gcBgMarkWorker(0xc000071800)
        /usr/lib/golang/src/runtime/mgc.go:1891 +0x105 fp=0xc000077fd8 sp=0xc000077f60 pc=0x56085fc65be5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000077fe0 sp=0xc000077fd8 pc=0x56085fcb66c1
created by runtime.gcBgMarkStartWorkers
        /usr/lib/golang/src/runtime/mgc.go:1839 +0x79

goroutine 66 [select, locked to thread]:
runtime.gopark(0x5608617a35a0, 0x0, 0x1809, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00059e608 sp=0xc00059e5e8 pc=0x56085fc81e86
runtime.selectgo(0xc00059e778, 0xc00059e770, 0x2, 0x8, 0x560860f93001)
        /usr/lib/golang/src/runtime/select.go:338 +0xcef fp=0xc00059e730 sp=0xc00059e608 pc=0x56085fc921ef
runtime.ensureSigM.func1()
        /usr/lib/golang/src/runtime/signal_unix.go:897 +0x1fa fp=0xc00059e7e0 sp=0xc00059e730 pc=0x56085fcaeafa
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00059e7e8 sp=0xc00059e7e0 pc=0x56085fcb66c1
created by runtime.ensureSigM
        /usr/lib/golang/src/runtime/signal_unix.go:880 +0xd7

goroutine 68 [select]:
runtime.gopark(0x5608617a35a0, 0x0, 0x1809, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000088d68 sp=0xc000088d48 pc=0x56085fc81e86
runtime.selectgo(0xc000088f80, 0xc000088ee0, 0x2, 0x3, 0x0)
        /usr/lib/golang/src/runtime/select.go:338 +0xcef fp=0xc000088e90 sp=0xc000088d68 pc=0x56085fc921ef
github.com/containers/podman/libpod/shutdown.Start.func1()
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/shutdown/handler.go:45 +0xcd fp=0xc000088fe0 sp=0xc000088e90 pc=0x560860bb478d
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000088fe8 sp=0xc000088fe0 pc=0x56085fcb66c1
created by github.com/containers/podman/libpod/shutdown.Start
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/shutdown/handler.go:44 +0x116

goroutine 67 [syscall]:
runtime.notetsleepg(0x56086238f060, 0xffffffffffffffff, 0xc00059a7c8)
        /usr/lib/golang/src/runtime/lock_futex.go:235 +0x38 fp=0xc00059a798 sp=0xc00059a768 pc=0x56085fc54af8
os/signal.signal_recv(0x0)
        /usr/lib/golang/src/runtime/sigqueue.go:147 +0x9e fp=0xc00059a7c0 sp=0xc00059a798 pc=0x56085fcb2dbe
os/signal.loop()
        /usr/lib/golang/src/os/signal/signal_unix.go:23 +0x25 fp=0xc00059a7e0 sp=0xc00059a7c0 pc=0x56086044ef05
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00059a7e8 sp=0xc00059a7e0 pc=0x56085fcb66c1
created by os/signal.Notify.func1.1
        /usr/lib/golang/src/os/signal/signal.go:150 +0x46

goroutine 56 [chan receive]:
runtime.gopark(0x5608617a3338, 0xc0000a5978, 0x170e, 0x2)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000505ed0 sp=0xc000505eb0 pc=0x56085fc81e86
runtime.chanrecv(0xc0000a5920, 0xc000505fb0, 0xc0000ae001, 0x560862359cf8)
        /usr/lib/golang/src/runtime/chan.go:577 +0x36f fp=0xc000505f60 sp=0xc000505ed0 pc=0x56085fc4edef
runtime.chanrecv2(0xc0000a5920, 0xc000505fb0, 0x1)
        /usr/lib/golang/src/runtime/chan.go:444 +0x2b fp=0xc000505f90 sp=0xc000505f60 pc=0x56085fc4ea6b
github.com/containers/podman/vendor/k8s.io/klog/v2.(*loggingT).flushDaemon(0x560862359ce0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/k8s.io/klog/v2/klog.go:1169 +0x8d fp=0xc000505fd8 sp=0xc000505f90 pc=0x56086088702d
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000505fe0 sp=0xc000505fd8 pc=0x56085fcb66c1
created by github.com/containers/podman/vendor/k8s.io/klog/v2.init.0
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/k8s.io/klog/v2/klog.go:417 +0xdf

goroutine 57 [chan receive]:
runtime.gopark(0x5608617a3338, 0xc000693918, 0xc00068170e, 0x2)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc0005066d0 sp=0xc0005066b0 pc=0x56085fc81e86
runtime.chanrecv(0xc0006938c0, 0xc0005067b0, 0xc0006a0101, 0x560862359b18)
        /usr/lib/golang/src/runtime/chan.go:577 +0x36f fp=0xc000506760 sp=0xc0005066d0 pc=0x56085fc4edef
runtime.chanrecv2(0xc0006938c0, 0xc0005067b0, 0x1)
        /usr/lib/golang/src/runtime/chan.go:444 +0x2b fp=0xc000506790 sp=0xc000506760 pc=0x56085fc4ea6b
github.com/containers/podman/vendor/k8s.io/klog.(*loggingT).flushDaemon(0x560862359b00)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/k8s.io/klog/klog.go:1010 +0x8d fp=0xc0005067d8 sp=0xc000506790 pc=0x56086088d86d
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0005067e0 sp=0xc0005067d8 pc=0x56085fcb66c1
created by github.com/containers/podman/vendor/k8s.io/klog.init.0
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/k8s.io/klog/klog.go:411 +0xd8

goroutine 86 [syscall]:
syscall.Syscall6(0xe8, 0xa, 0xc00098fb6c, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/syscall/asm_linux_amd64.s:41 +0x5 fp=0xc00098faa0 sp=0xc00098fa98 pc=0x56085fd0c625
github.com/containers/podman/vendor/golang.org/x/sys/unix.EpollWait(0xa, 0xc00098fb6c, 0x7, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/golang.org/x/sys/unix/zsyscall_linux_amd64.go:76 +0x74 fp=0xc00098fb10 sp=0xc00098faa0 pc=0x5608600060d4
github.com/containers/podman/vendor/github.com/fsnotify/fsnotify.(*fdPoller).wait(0xc000044ec0, 0x0, 0x0, 0x0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/fsnotify/fsnotify/inotify_poller.go:86 +0x93 fp=0xc00098fbd8 sp=0xc00098fb10 pc=0x560860428433
github.com/containers/podman/vendor/github.com/fsnotify/fsnotify.(*Watcher).readEvents(0xc0001466e0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/fsnotify/fsnotify/inotify.go:192 +0x206 fp=0xc00099ffd8 sp=0xc00098fbd8 pc=0x5608604275a6
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00099ffe0 sp=0xc00099ffd8 pc=0x56085fcb66c1
created by github.com/containers/podman/vendor/github.com/fsnotify/fsnotify.NewWatcher
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/fsnotify/fsnotify/inotify.go:59 +0x1a8

goroutine 87 [select]:
runtime.gopark(0x5608617a35a0, 0x0, 0x1809, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc00083ad40 sp=0xc00083ad20 pc=0x56085fc81e86
runtime.selectgo(0xc00083af38, 0xc00083aeb4, 0x3, 0x0, 0x0)
        /usr/lib/golang/src/runtime/select.go:338 +0xcef fp=0xc00083ae68 sp=0xc00083ad40 pc=0x56085fc921ef
github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni.(*cniNetworkPlugin).monitorConfDir(0xc0005ec0c0, 0xc0006a7c30)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni/ocicni.go:150 +0x1a5 fp=0xc00083afd0 sp=0xc00083ae68 pc=0x56086043bfe5
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00083afd8 sp=0xc00083afd0 pc=0x56085fcb66c1
created by github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni.initCNI
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni/ocicni.go:250 +0x3b1

goroutine 98 [sleep]:
runtime.gopark(0x5608617a3588, 0xc0000ae140, 0x1313, 0x1)
        /usr/lib/golang/src/runtime/proc.go:306 +0xe6 fp=0xc000835710 sp=0xc0008356f0 pc=0x56085fc81e86
time.Sleep(0x5f5e100)
        /usr/lib/golang/src/runtime/time.go:188 +0xbf fp=0xc000835750 sp=0xc000835710 pc=0x56085fcb367f
github.com/containers/podman/libpod.pidwaitPidStop.func1(0xc0006ae0c0, 0x9fdc, 0xc0006ae060)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/oci_conmon_linux.go:959 +0x32 fp=0xc0008357c8 sp=0xc000835750 pc=0x560860cba852
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0008357d0 sp=0xc0008357c8 pc=0x56085fcb66c1
created by github.com/containers/podman/libpod.waitPidStop
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/oci_conmon_linux.go:946 +0xa6
Aborted (core dumped)
[root@overcloud-computesriov-1 ~]#

Describe the results you expected:

Podman should not crash, and the container should stop cleanly.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

[root@overcloud-computesriov-1 ~]# podman version
Version:      3.0.2-dev
API Version:  3.0.0
Go Version:   go1.15.7
Built:        Wed Apr  7 13:07:54 2021
OS/Arch:      linux/amd64
[root@overcloud-computesriov-1 ~]#

Output of podman info --debug:

[root@overcloud-computesriov-1 ~]# podman info --debug
host:
  arch: amd64
  buildahVersion: 1.19.8
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.26-1.module+el8.4.0+11310+8c67a752.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 7d8f1d32c3e8065fe9811115184eee15715a3e78'
  cpus: 16
  distribution:
    distribution: '"rhel"'
    version: "8.4"
  eventLogger: journald
  hostname: overcloud-computesriov-0
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-305.12.1.el8_4.x86_64
  linkmode: dynamic
  memFree: 368571539456
  memTotal: 405380489216
  ociRuntime:
    name: runc
    package: runc-1.0.0-71.rc92.module+el8.4.0+11310+8c67a752.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.2-dev'
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 15m 7.79s
registries:
  manager.ctlplane.example.com:8787:
    Blocked: false
    Insecure: true
    Location: manager.ctlplane.example.com:8787
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: manager.ctlplane.example.com:8787
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 17
    paused: 0
    running: 9
    stopped: 8
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 9
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1617800874
  BuiltTime: Wed Apr  7 13:07:54 2021
  GitCommit: ""
  GoVersion: go1.15.7
  OsArch: linux/amd64
  Version: 3.0.2-dev

Package info (e.g. output of rpm -q podman or apt list podman):

[root@overcloud-computesriov-1 ~]# rpm -q podman
podman-3.0.1-6.module+el8.4.0+10614+dd38312c.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes, and the same issue occurs.

Additional environment details (AWS, VirtualBox, physical, etc.):

RHOSP 16.2, baremetal environment

giuseppe commented 2 years ago

What other processes are running on the host?

How many CPU cores are available?

@mheon could it be different podman installations (different graphroots) stepping on each other's SHM locks?

mheon commented 2 years ago

I don't think so - it shouldn't be possible for two Libpods to allocate the same lock.

More likely, IMO, is that we somehow ended up with more than one container/pod/volume using the same lock (probably 0). Does a podman system renumber help?
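
For anyone trying that suggestion: podman system renumber reassigns lock numbers to every container, pod, and volume, and must be run while no other Podman processes are active. A minimal sketch:

# make sure no other podman processes are running, then reallocate all libpod locks
podman system renumber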

vikram077 commented 2 years ago

How many CPU cores are available?

[root@overcloud-computesriov-1 ~]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              88
On-line CPU(s) list: 0-87
Thread(s) per core:  2
Core(s) per socket:  22
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2696 v4 @ 2.20GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2696 v4 @ 2.20GHz
Stepping:            1
CPU MHz:             2799.743
BogoMIPS:            4399.93
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            56320K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts flush_l1d
[root@overcloud-computesriov-1 ~]#

Memory

[root@overcloud-computesriov-1 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:          377Gi        33Gi       343Gi       8.0Mi       697Mi       342Gi
Swap:            0B          0B          0B
[root@overcloud-computesriov-1 ~]#

What other processes are running on the host?

It is a fresh deployment of RHOSP 16.2; the crash happens only when qemu is running on the node, otherwise everything works as expected.


Also, please check a few of my findings below.

We have also observed the following logs on our system:

[root@overcloud-computesriov-1 ~]# grep panic /var/log/messages
Dec 27 09:59:59 overcloud-computesriov-1 podman[667957]: panic: operation not permitted
Dec 27 09:59:59 overcloud-computesriov-1 podman[667957]: panic(0x56227278c460, 0xc00053b0f0)
Dec 27 09:59:59 overcloud-computesriov-1 podman[667957]: #011/usr/lib/golang/src/runtime/panic.go:1064 +0x545 fp=0xc0003e3d78 sp=0xc0003e3cb0 pc=0x562270efc005
Dec 27 10:00:00 overcloud-computesriov-1 systemd-coredump[668152]: Process 667957 (podman) of user 0 dumped core.#012#012Stack trace of thread 668015:#012#0  0x0000562270f351e1 runtime.raise (podman)#012#1  0x0000562270f137b1 runtime.sigfwdgo (podman)#012#2  0x0000562270f11e14 runtime.sigtrampgo (podman)#012#3  0x0000562270f35583 runtime.sigtramp (podman)#012#4  0x00007f0e2e454b20 __restore_rt (libpthread.so.0)#012#5  0x0000562270f351e1 runtime.raise (podman)#012#6  0x0000562270efc6cd runtime.fatalpanic (podman)#012#7  0x0000562270efc005 runtime.gopanic (podman)#012#8  0x0000562271b0bd4f github.com/containers/podman/libpod/lock.(*SHMLock).Unlock (podman)#012#9  0x0000562271e58d98 github.com/containers/podman/libpod.(*Container).StopWithTimeout (podman)#012#10 0x000056227204a3f3 github.com/containers/podman/pkg/domain/infra/abi.(*ContainerEngine).ContainerStop.func1 (podman)#012#11 0x0000562271fad890 github.com/containers/podman/pkg/parallel/ctr.ContainerOp.func1 (podman)#012#12 0x0000562271e375d8 github.com/containers/podman/pkg/parallel.Enqueue.func1 (podman)#012#13 0x0000562270f339c1 runtime.goexit (podman)
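
Since systemd-coredump captured the crash, the core dump can be inspected directly. A sketch, with the PID taken from the log line above:

# list captured podman core dumps, then show details for the crash above
coredumpctl list podman
coredumpctl info 667957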

We are also unable to understand why podman restarted the container when we tried to stop it; see the outputs below:

[root@overcloud-computesriov-1 ~]# podman ps | grep libvirt
4bc4078832a7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago     Up 17 hours ago            nova_virtlogd
038a939cedc7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago     Up 17 hours ago            nova_libvirt
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman stop nova_libvirt
ERRO[0000] Failed to remove paths: map[hugetlb:/sys/fs/cgroup/hugetlb/machine.slice/libpod-038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6.scope name=systemd:/sys/fs/cgroup/systemd/machine.slice/libpod-038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6.scope pids:/sys/fs/cgroup/pids/machine.slice/libpod-038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6.scope]
038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6

# the container restarted on its own here
[root@overcloud-computesriov-1 ~]# podman ps | grep libvirt
4bc4078832a7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago     Up 17 hours ago            nova_virtlogd
038a939cedc7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago     Up 3 seconds ago           nova_libvirt

[root@overcloud-computesriov-1 ~]# podman stop nova_libvirt
038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6
panic: operation not permitted

goroutine 79 [running]:
panic(0x55b0faf20ba0, 0xc00038f810)
        /usr/lib/golang/src/runtime/panic.go:1064 +0x545 fp=0xc0003bdd78 sp=0xc0003bdcb0 pc=0x55b0f969eda5
github.com/containers/podman/libpod/lock.(*SHMLock).Unlock(0xc0003cdcd0)
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/libpod/lock/shm_lock_manager_linux.go:121 +0x8f fp=0xc0003bdda8 sp=0xc0003bdd78 pc=0x55b0fa2a310f
  :
  :
  :
        /usr/lib/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000137d8 sp=0xc0000137d0 pc=0x55b0f96d66c1
created by github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni.initCNI
        /builddir/build/BUILD/containers-podman-ad1aaba/_build/src/github.com/containers/podman/vendor/github.com/cri-o/ocicni/pkg/ocicni/ocicni.go:250 +0x3b1
Aborted (core dumped)
[root@overcloud-computesriov-1 ~]# podman ps -a | grep libvirt
4bc4078832a7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago    Up 17 hours ago                 nova_virtlogd
038a939cedc7  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                kolla_start           10 days ago    stopping                        nova_libvirt
84e1b8b1aafb  manager.ctlplane.example.com:8787/rhosp-rhel8/openstack-nova-libvirt:16.2                /bin/bash -c /usr...  10 days ago    Exited (0) 10 days ago          nova_libvirt_init_secret
[root@overcloud-computesriov-1 ~]#

After this, I can't perform any action on the container; see the output below:

[root@overcloud-computesriov-1 ~]# podman stop nova_libvirt
Error: can only stop created or running containers. 038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6 is in state stopping: container state improper
[root@overcloud-computesriov-1 ~]#
[root@overcloud-computesriov-1 ~]# podman start nova_libvirt
Error: unable to start container "038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6": container 038a939cedc7eccd0c05e0372a84a238a823fddc9558e92f0b004aea673368d6 must be in Created or Stopped state to be started: container state improper
[root@overcloud-computesriov-1 ~]#

For now, a reboot is the only way to bring the container back to a proper state.

mheon commented 2 years ago

Are you sure it's Podman restarting the container? OpenStack makes extensive use of systemd-managed containers, and I view it as likely that this is one of them, so systemd may have decided to restart the container when it detected it stopping.
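
One way to verify this is to look for a systemd unit wrapping the container. A sketch, assuming the TripleO naming convention tripleo_<container>.service (the exact unit name on a given deployment may differ):

# look for a unit managing the container; the unit name below is an assumption
systemctl list-units --all | grep -i nova_libvirt
systemctl status tripleo_nova_libvirt.service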

The Stopping state issue should be fixed in newer Podman releases and backported to the 3.0 stream used by OpenStack, but I'm not sure when that fix will arrive.

mheon commented 2 years ago

Regardless, I strongly recommend you open a BZ about this. There are a lot more moving parts here than just Podman (in an OpenStack environment, Podman is heavily orchestrated by systemd), and I think we'll need debugging assistance from the OSP team on this.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

vikram077 commented 2 years ago

Root Cause

The cgroup is not cleaned up properly when the user stops the container, so when the nova_libvirt container is started again it tries to create a new cgroup, which conflicts with the stale one. The issue occurs only when a VM is running on that compute node.
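
A quick way to check for the stale cgroup is to look for leftover libpod scopes after the container has stopped (a sketch; the hierarchies match the paths in the error message above):

# after podman stop, any libpod-<id>.scope still present here is a stale cgroup
ls /sys/fs/cgroup/systemd/machine.slice/ | grep libpod
ls /sys/fs/cgroup/pids/machine.slice/ | grep libpod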

Solution

  1. Ensure no VM is running on your compute node.
  2. Install systemd-container.
  3. Make sure systemd-machined.service is up and running (see the commands after this section).

In RHOSP 16.1, systemd-machined runs by default, while in RHOSP 16.2 it is not running, so it needs to be started manually.
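
The corresponding commands, as a sketch for a RHEL 8 host:

# install the package that ships systemd-machined, then start and enable it
dnf install -y systemd-container
systemctl enable --now systemd-machined.service
systemctl is-active systemd-machined.service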