containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Volume leaking causes "error allocating lock for new container: allocation failed; exceeded num_locks" #14195

Closed ttys3 closed 2 years ago

ttys3 commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description: I started hitting this recently.

# podman run --rm -ti  alpine sh
Error: error allocating lock for new container: allocation failed; exceeded num_locks (2048)

Fewer than about 30 containers are created concurrently, and then lock allocation fails; no containers can be created anymore.

Here is the total number of containers that were created:

# podman ps | wc -l
22

I changed the num_locks limit from 2048 to 4096, but this did not help at all.

allocation failed; exceeded num_locks (4096)

After changing num_locks to 8192, things seem to work.
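
For reference, a minimal sketch of that change, assuming the system-wide config at /etc/containers/containers.conf (num_locks lives in the [engine] table) and that the SHM lock segment has to be rebuilt afterwards:

# raise the engine lock limit in the system-wide config (assumption: no [engine] table exists yet)
cat >> /etc/containers/containers.conf <<'EOF'
[engine]
num_locks = 8192
EOF
# the SHM lock segment is sized when it is created, so renumber (or reboot) afterwards
podman system renumber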

I put some debug logging into the Podman code: when it tries to acquire an SHM lock and succeeds, it prints a log line with the special prefix xxdebug.

When it tries to acquire an SHM lock and fails, it also prints a log line with the xxdebug prefix.

diff --git a/libpod/runtime_ctr.go b/libpod/runtime_ctr.go
index df7174ac6..68e9041c7 100644
--- a/libpod/runtime_ctr.go
+++ b/libpod/runtime_ctr.go
@@ -299,7 +299,7 @@ func (r *Runtime) setupContainer(ctx context.Context, ctr *Container) (_ *Contai
        }
        ctr.lock = lock
        ctr.config.LockID = ctr.lock.ID()
-       logrus.Debugf("Allocated lock %d for container %s", ctr.lock.ID(), ctr.ID())
+       logrus.Infof("xxdebug Allocated lock %d for container %s name: %s", ctr.lock.ID(), ctr.ID(), ctr.Name())

        defer func() {
                if retErr != nil {
@@ -774,6 +774,9 @@ func (r *Runtime) removeContainer(ctx context.Context, c *Container, force, remo
                } else {
                        logrus.Errorf("Free container lock: %v", err)
                }
+               logrus.Errorf("xxdebug Free container lock failed, %d for container %s name: %s err: %v", c.lock.ID(), c.ID(), c.Name(), err)
+       } else {
+               logrus.Infof("xxdebug Free container lock success, %d for container %s name: %s", c.lock.ID(), c.ID(), c.Name())
        }

I counted the log entries in total: there are only 68, so why does it need a num_locks limit of 8192?
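
A less invasive way to see the same information, assuming the stock "Allocated lock" debug message shown above, is to raise the CLI log level instead of patching:

# print the lock ID Podman allocates for a throwaway container
podman --log-level=debug run --rm alpine true 2>&1 | grep 'Allocated lock'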

Steps to reproduce the issue:

1.

2.

3.

Describe the results you received:

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.2.0-dev
API Version:  4.2.0-dev
Go Version:   go1.18.1
Git Commit:   c379014ee4e57dc19669ae92f45f8e4c0814295b-dirty
Built:        Wed May 11 23:08:15 2022
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.26.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.0-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpuUtilization:
    idlePercent: 96.93
    systemPercent: 0.94
    userPercent: 2.13
  cpus: 20
  distribution:
    distribution: arch
    version: unknown
  eventLogger: file
  hostname: wudeng
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.17.5-arch1-2
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 14855860224
  memTotal: 33387909120
  networkBackend: cni
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.4.5-1
    path: /usr/bin/crun
    version: |-
      crun version 1.4.5
      commit: c381048530aa750495cf502ddb7181f2ded5b400
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 17179865088
  swapTotal: 17179865088
  uptime: 23m 10.17s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  hub.k8s.lan:
    Blocked: false
    Insecure: true
    Location: hub.k8s.lan
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: hub.k8s.lan
    PullFromMirror: ""
  search:
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 36
    paused: 0
    running: 21
    stopped: 15
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 178534772736
  graphRootUsed: 136645885952
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 63
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.2.0-dev
  Built: 1652281695
  BuiltTime: Wed May 11 23:08:15 2022
  GitCommit: c379014ee4e57dc19669ae92f45f8e4c0814295b-dirty
  GoVersion: go1.18.1
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.0-dev

Package info (e.g. output of rpm -q podman or apt list podman):

Built from the main branch.

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

I use the nomad podman driver to create containers, which may issue concurrent requests to the libpod REST API.

Luap99 commented 2 years ago

@mheon PTAL

mheon commented 2 years ago

Do you have a reproducer? The locks code should be thread-safe so I don't see how this could be happening.

ttys3 commented 2 years ago

@mheon the containers were created by nomad-podman-driver (which actually calls the libpod API, possibly concurrently).

My environment has 22 containers, each with a different Nomad job file, so it is not easy for me to build a simple reproducer.

It will take some time, but I will try to provide a detailed reproducer.

Luap99 commented 2 years ago

If you reboot it should clear all locks (maybe you could also try podman system renumber?)

Also note that not only containers have locks, pods and volumes also use them, so be sure to check them as well.
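
A quick sketch for checking that, assuming one lock per container, pod, and named volume:

# rough estimate of locks in use: one lock per container, pod, and named volume
echo $(( $(podman ps -aq | wc -l) + $(podman pod ls -q | wc -l) + $(podman volume ls -q | wc -l) ))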

mheon commented 2 years ago

All container creations should log the number of the lock allocated:

DEBU[0000] Allocated lock 0 for container b987f4594a5d35495ef94829c42e8500cfbcab7ee87b846be23a07b5fe5de80f 

If that is a number in the thousands - potentially, we have lock exhaustion. If not, it is likely some sort of race in the lock allocation code that is being caused by parallel container creation. Again, though, the lock allocation code is itself locked for thread safety. I don't see how that could race.

ttys3 commented 2 years ago

If you reboot it should clear all locks (maybe you could also try podman system renumber?)

Also note that not only containers have locks, pods and volumes also use them, so be sure to check them as well.

Yes, the problem comes back every time I reboot the machine.

After the system boots, nomad-driver-podman (which actually calls the Podman REST API) starts or creates many containers (fewer than 30).

"not only containers have locks, pods and volumes also use them".

I do not have pods, only containers and volumes. The maximum number of volumes on any container is 11; even if every container had 11 volumes, the lock count should be (11+1)*22 = 264.

ttys3 commented 2 years ago

I use num_locks = 8192 now, and currently it works fine.

sudo podman ps | wc -l
21
rhatdan commented 2 years ago

Can we close this issue?

banool commented 2 years ago

I'm hitting this issue too, in a similar situation where the actual number of containers I have is very small:

$ podman container ls | wc -l
17
$ sudo podman container ls | wc -l
1

To remediate, I set num_locks to 8192 and then ran this:

sudo rm -rf /dev/shm/*
sudo podman system renumber
podman system renumber
sudo podman system prune
podman system prune

I will report back if the issue appears again.

In my use case I'm creating a container every few seconds with podman run, but as you can see above, they don't persist. Calling podman system renumber doesn't help. I am going to try running with the --rm flag to see if it helps.
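
A small sketch to check whether --rm also cleans up the anonymous volume a container creates (my assumption is that it should):

podman volume ls -q | wc -l           # baseline volume count
podman run --rm -v /data alpine true  # -v /data creates an anonymous volume
podman volume ls -q | wc -l           # should match the baseline if --rm removed it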

mheon commented 2 years ago

Any chance you're leaking volumes? podman volume ls would show them
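
A quick check for leaked volumes, assuming leaked ones show up as dangling (not referenced by any container):

podman volume ls --filter dangling=true -q | wc -l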

ttys3 commented 2 years ago

@mheon I checked the volumes and yes, they are leaking. So this is why "allocation failed; exceeded num_locks" happened.

sudo podman volume ls | wc -l
6617

That's why 4096 is not enough (after changing num_locks to 8192, it works fine).

But there must be another problem here.

If a container gets removed, its volumes should be removed too.

I have never explicitly created a volume using sudo podman volume create.

As I previously mentioned:

I do not have pods, only containers and volumes. The maximum number of volumes in any one container is 11.

I have 22 running containers; the total number of containers (including dead ones) is 37.

So the lock count should be at most (11+1)*37 = 444.

So the question is: why are so many volumes leaking?

May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb6fc1881820f25a6dd643b0a6fbd8ee2220f63c3edbf1f88cbe906a56a18426: acquiring lock 30 for volume eb6fc1881820f25a6dd643b0a6fbd8ee2220f63c3edbf1f88cbe906a56a18426: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb6ec2dbf9ef53f0fb54b400592e4650aac3efa1670dc27fa1ef787303c41846: acquiring lock 1883 for volume eb6ec2dbf9ef53f0fb54b400592e4650aac3efa1670dc27fa1ef787303c41846: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb5ee3f804d2c402e9135d32c75db12f52acbf10057dabf6398030fe2cd1a6a4: acquiring lock 406 for volume eb5ee3f804d2c402e9135d32c75db12f52acbf10057dabf6398030fe2cd1a6a4: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb4d9739079e0aec756d4664bf9e9d22d495dce591622ab657040dad956809df: acquiring lock 365 for volume eb4d9739079e0aec756d4664bf9e9d22d495dce591622ab657040dad956809df: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb4489ef84ab517fe8d0b79a561e7b686194897fd6bd5937a2e102f514263504: acquiring lock 1882 for volume eb4489ef84ab517fe8d0b79a561e7b686194897fd6bd5937a2e102f514263504: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb40a9dd1caf07ac30c60879aa31b6dea298d17a36ef159d123e67e6936f48f9: acquiring lock 1881 for volume eb40a9dd1caf07ac30c60879aa31b6dea298d17a36ef159d123e67e6936f48f9: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb336a2a4e1053bd3183bfcc662e8d70c9b009ef954ace9a0a7bc7844504288e: acquiring lock 1880 for volume eb336a2a4e1053bd3183bfcc662e8d70c9b009ef954ace9a0a7bc7844504288e: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb309917c7c77ca5ff0333df890846d544fd03e285cfe42dbc0a320d3fe45c0c: acquiring lock 1591 for volume eb309917c7c77ca5ff0333df890846d544fd03e285cfe42dbc0a320d3fe45c0c: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb24f8ae6f7a72d7a31bcaa586367f34ae0b32f227759426aca4ac47e12668e8: acquiring lock 1879 for volume eb24f8ae6f7a72d7a31bcaa586367f34ae0b32f227759426aca4ac47e12668e8: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb211bbb807be65ba678a76fdb7dc56cfae397beb971e936e7601c3031b91a7b: acquiring lock 336 for volume eb211bbb807be65ba678a76fdb7dc56cfae397beb971e936e7601c3031b91a7b: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb1f3b76c39225ce516058761cd6ad4421f158c744be85f1ff0df33e21f28e74: acquiring lock 905 for volume eb1f3b76c39225ce516058761cd6ad4421f158c744be85f1ff0df33e21f28e74: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb1c28972ad1376949c17d30ca17240c3ddfefe72d68a45c51fbfbad368cd6da: acquiring lock 2256 for volume eb1c28972ad1376949c17d30ca17240c3ddfefe72d68a45c51fbfbad368cd6da: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb17aba2770c716521052a30dc33fd9e51ff5fec7bcfe875cb078e409f5fb0cb: acquiring lock 315 for volume eb17aba2770c716521052a30dc33fd9e51ff5fec7bcfe875cb078e409f5fb0cb: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb0a45a8250853493586323899b572ce5d73509c05383573eb3d3ad16deadad6: acquiring lock 656 for volume eb0a45a8250853493586323899b572ce5d73509c05383573eb3d3ad16deadad6: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eb02ba6e7c3e7af197e65aba528ac4a83643eaef00574c53690073ff5d9b975d: acquiring lock 1952 for volume eb02ba6e7c3e7af197e65aba528ac4a83643eaef00574c53690073ff5d9b975d: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eaf40ab08f05396fc7716bfcdc01ed6ab59338e121c6eee5fe62837eb1e1ff32: acquiring lock 999 for volume eaf40ab08f05396fc7716bfcdc01ed6ab59338e121c6eee5fe62837eb1e1ff32: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eaf2ffb4eed2b0504c00fd74619db4a70eeeb2bf842dc6b85f2d9f14d44a664b: acquiring lock 1095 for volume eaf2ffb4eed2b0504c00fd74619db4a70eeeb2bf842dc6b85f2d9f14d44a664b: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eaca755c75247623ed7a70951b312f6a652f48350a5ad59873e4e9308a4c00c0: acquiring lock 56 for volume eaca755c75247623ed7a70951b312f6a652f48350a5ad59873e4e9308a4c00c0: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eac3f11ed81158ccbd6c34fac618f03f5920d1425efee330b2bf5f13309b4297: acquiring lock 2059 for volume eac3f11ed81158ccbd6c34fac618f03f5920d1425efee330b2bf5f13309b4297: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eabeb5b5530c27c6543187038c8fd2c045bb137271158d35a17af925184fdb1a: acquiring lock 1878 for volume eabeb5b5530c27c6543187038c8fd2c045bb137271158d35a17af925184fdb1a: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eabd9dd2b84e9635390bce95b9cfbc408914a1e0797e1eb7212dba0ef1e6af08: acquiring lock 43 for volume eabd9dd2b84e9635390bce95b9cfbc408914a1e0797e1eb7212dba0ef1e6af08: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eab5dc2bc6026d28bd591dfa9ddfc8ad7ce77188595cc872ce1a0a722b7d2b41: acquiring lock 2097 for volume eab5dc2bc6026d28bd591dfa9ddfc8ad7ce77188595cc872ce1a0a722b7d2b41: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eab5846a336ae075f420a5b402f41f8c8343e6f0ef7d027f3e176a3dcbe77f14: acquiring lock 1877 for volume eab5846a336ae075f420a5b402f41f8c8343e6f0ef7d027f3e176a3dcbe77f14: file exists"
May 15 22:56:19 user001 podman[1755]: time="2022-05-15T22:56:19+08:00" level=error msg="Refreshing volume eaa73b5fd12b152ed0af93ad689f656522df954296d200dc135d20b23616906f: acquiring lock 1876 for volume eaa73b5fd12b152ed0af93ad689f656522df954296d200dc135d20b23616906f: file exists"
mheon commented 2 years ago

The containers in question are set to autoremove, right? Autoremove via CLI should be removing volumes (had to check the code, but it's definitely set to remove dependent volumes); I suppose it's possible that autoremove via API might not be doing so? Any chance you can verify that volumes only leak via API?
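
One way to compare the two paths, as a sketch: remove a test container over the socket and explicitly ask for its volumes, then see whether anything is left dangling. This assumes the v=true query parameter on the libpod container delete endpoint, the versioned path prefix shown, and a placeholder <container-id>:

curl -s --unix-socket /run/podman/podman.sock \
     -X DELETE "http://d/v4.0.0/libpod/containers/<container-id>?force=true&v=true"
podman volume ls --filter dangling=true   # anything listed here was left behind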

ttys3 commented 2 years ago

No. All the containers are created via the libpod REST API, and they do not have the autoremove option enabled.

After sudo podman volume prune, there are now only 12 volumes:

❯ sudo podman volume ls | wc -l
12

So 6605 volumes had leaked.

I checked, and there are only 10+ dead containers, so they alone cannot leak that many volumes.

The volume leak must have accumulated over a rather long time to reach this point.

mheon commented 2 years ago

I assume you're in a situation where there is container churn (containers are regularly created/removed)? If so, how are you removing them?
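
Worth noting for the removal path: when containers are removed explicitly (rather than auto-removed), their anonymous volumes are, as far as I can tell, only cleaned up if the volumes flag is passed, e.g.:

podman rm --volumes <container-id>   # <container-id> is a placeholder; --volumes also removes the container's anonymous volumes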

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

@ttys3 Is this still an issue or has it gone away?

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

Since I never heard back, I am assuming this is fixed, closing