containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

podman 5.2.0 does not close connections properly on macOS #23616

Open maciej-szlosarczyk opened 1 month ago

maciej-szlosarczyk commented 1 month ago

Issue Description

After upgrading to 5.2.0, gvproxy keeps connections open for a long time. I noticed it while running tests against PostgreSQL 11 in a container: after a few runs, postgres would accumulate enough connections and memory usage that it would either get killed due to memory limits or report that too many connections are open.

This is the output of lsof from both the host and inside the podman machine ten minutes after I ran the test:

podman-machine $ lsof -nP -iTCP

rootlessp 2572 core    4u  IPv6  12796      0t0  TCP 192.168.127.2:5432->192.168.127.1:26886 (ESTABLISHED)
rootlessp 2572 core   11u  IPv6  12698      0t0  TCP *:5432 (LISTEN)
rootlessp 2572 core   12u  IPv6  12797      0t0  TCP 192.168.127.2:5432->192.168.127.1:26884 (ESTABLISHED)
rootlessp 2572 core   20u  IPv6  12042      0t0  TCP 192.168.127.2:5432->192.168.127.1:26887 (ESTABLISHED)
rootlessp 2572 core   22u  IPv6  12043      0t0  TCP 192.168.127.2:5432->192.168.127.1:26885 (ESTABLISHED)
rootlessp 2572 core   30u  IPv6  12799      0t0  TCP 192.168.127.2:5432->192.168.127.1:26888 (ESTABLISHED)
rootlessp 2572 core   37u  IPv6  12801      0t0  TCP 192.168.127.2:5432->192.168.127.1:26890 (ESTABLISHED)
rootlessp 2572 core   38u  IPv6  12802      0t0  TCP 192.168.127.2:5432->192.168.127.1:26893 (ESTABLISHED)
rootlessp 2572 core   41u  IPv6  12805      0t0  TCP 192.168.127.2:5432->192.168.127.1:26889 (ESTABLISHED)
rootlessp 2572 core   42u  IPv6  12806      0t0  TCP 192.168.127.2:5432->192.168.127.1:26892 (ESTABLISHED)
rootlessp 2572 core   45u  IPv6  12809      0t0  TCP 192.168.127.2:5432->192.168.127.1:26891 (ESTABLISHED)
rootlessp 2572 core   83u  IPv6  12063      0t0  TCP 192.168.127.2:5432->192.168.127.1:26894 (ESTABLISHED)
rootlessp 2572 core   84u  IPv6  12064      0t0  TCP 192.168.127.2:5432->192.168.127.1:26895 (ESTABLISHED)
rootlessp 2572 core   87u  IPv6  14581      0t0  TCP 192.168.127.2:5432->192.168.127.1:26896 (ESTABLISHED)
rootlessp 2572 core  104u  IPv6  12069      0t0  TCP 192.168.127.2:5432->192.168.127.1:26899 (ESTABLISHED)
rootlessp 2572 core  105u  IPv6  12070      0t0  TCP 192.168.127.2:5432->192.168.127.1:26897 (ESTABLISHED)
rootlessp 2572 core  107u  IPv6  14587      0t0  TCP 192.168.127.2:5432->192.168.127.1:26898 (ESTABLISHED)
rootlessp 2572 core  109u  IPv6  14588      0t0  TCP 192.168.127.2:5432->192.168.127.1:26900 (ESTABLISHED)
rootlessp 2572 core  128u  IPv6  13660      0t0  TCP 192.168.127.2:5432->192.168.127.1:26901 (ESTABLISHED)
rootlessp 2572 core  134u  IPv6  13661      0t0  TCP 192.168.127.2:5432->192.168.127.1:26903 (ESTABLISHED)
rootlessp 2572 core  136u  IPv6  14598      0t0  TCP 192.168.127.2:5432->192.168.127.1:26902 (ESTABLISHED)
rootlessp 2572 core  137u  IPv6  14599      0t0  TCP 192.168.127.2:5432->192.168.127.1:26904 (ESTABLISHED)
rootlessp 2572 core  140u  IPv6  12838      0t0  TCP 192.168.127.2:5432->192.168.127.1:26906 (ESTABLISHED)
rootlessp 2572 core  141u  IPv6  12839      0t0  TCP 192.168.127.2:5432->192.168.127.1:26905 (ESTABLISHED)
rootlessp 2572 core  142u  IPv6  12840      0t0  TCP 192.168.127.2:5432->192.168.127.1:26913 (ESTABLISHED)
rootlessp 2572 core  146u  IPv6  12843      0t0  TCP 192.168.127.2:5432->192.168.127.1:26916 (ESTABLISHED)
rootlessp 2572 core  147u  IPv6  12844      0t0  TCP 192.168.127.2:5432->192.168.127.1:26911 (ESTABLISHED)
rootlessp 2572 core  150u  IPv6  14602      0t0  TCP 192.168.127.2:5432->192.168.127.1:26914 (ESTABLISHED)
rootlessp 2572 core  151u  IPv6  14603      0t0  TCP 192.168.127.2:5432->192.168.127.1:26908 (ESTABLISHED)
rootlessp 2572 core  154u  IPv6  12848      0t0  TCP 192.168.127.2:5432->192.168.127.1:26915 (ESTABLISHED)
rootlessp 2572 core  155u  IPv6  12849      0t0  TCP 192.168.127.2:5432->192.168.127.1:26907 (ESTABLISHED)
rootlessp 2572 core  158u  IPv6  12852      0t0  TCP 192.168.127.2:5432->192.168.127.1:26912 (ESTABLISHED)
rootlessp 2572 core  159u  IPv6  12853      0t0  TCP 192.168.127.2:5432->192.168.127.1:26909 (ESTABLISHED)
rootlessp 2572 core  162u  IPv6  14606      0t0  TCP 192.168.127.2:5432->192.168.127.1:26910 (ESTABLISHED)
rootlessp 2572 core  244u  IPv6  12100      0t0  TCP 192.168.127.2:5432->192.168.127.1:26917 (ESTABLISHED)
macos $ lsof -nP -iTCP

gvproxy   60061 maciej   11u  IPv4 0x5e926f38c5bfb168      0t0  TCP 127.0.0.1:50705 (LISTEN)
gvproxy   60061 maciej   16u  IPv4 0x953406e4466c535d      0t0  TCP 127.0.0.1:50705->127.0.0.1:50737 (CLOSED)
gvproxy   60061 maciej   19u  IPv4 0x3789059812058dd1      0t0  TCP 127.0.0.1:50705->127.0.0.1:50739 (CLOSED)
gvproxy   60061 maciej   22u  IPv4  0x9682450022e7d02      0t0  TCP 127.0.0.1:50705->127.0.0.1:50873 (CLOSED)
gvproxy   60061 maciej   23u  IPv4 0x1be1c29eac5d6858      0t0  TCP 127.0.0.1:50705->127.0.0.1:50875 (CLOSED)
gvproxy   60061 maciej   25u  IPv4 0xbd8071c31497745e      0t0  TCP 127.0.0.1:50705->127.0.0.1:50740 (CLOSED)
gvproxy   60061 maciej   29u  IPv4 0x9454279b154def93      0t0  TCP 127.0.0.1:50705->127.0.0.1:50881 (CLOSED)
gvproxy   60061 maciej   30u  IPv4 0x61dc385e10861b30      0t0  TCP 127.0.0.1:50705->127.0.0.1:50882 (CLOSED)
gvproxy   60061 maciej   31u  IPv6 0x1c7ced5e0ea769f9      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50783 (CLOSED)
gvproxy   60061 maciej   32u  IPv6 0xf51794d53ac7bf61      0t0  TCP *:5432 (LISTEN)
gvproxy   60061 maciej   35u  IPv6 0x8ee1e59331d9b4eb      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50784 (CLOSED)
gvproxy   60061 maciej   36u  IPv6 0x4461e9b487ca5038      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50785 (CLOSED)
gvproxy   60061 maciej   37u  IPv6 0x654daf1c1b524c81      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50786 (CLOSED)
gvproxy   60061 maciej   40u  IPv6 0xbd1f8da5525d8ea9      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50787 (CLOSED)
gvproxy   60061 maciej   41u  IPv6 0xf4ee4b56e2fe6d28      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50788 (CLOSED)
gvproxy   60061 maciej   42u  IPv6 0xebefcd2083efd482      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50789 (CLOSED)
gvproxy   60061 maciej   43u  IPv6 0x2688979dd0323152      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50790 (CLOSED)
gvproxy   60061 maciej   45u  IPv6 0xa80ac723feddf84b      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50791 (CLOSED)
gvproxy   60061 maciej   46u  IPv6 0x4422ea55d7b9df09      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50792 (CLOSED)
gvproxy   60061 maciej   47u  IPv6 0xa8a2e8f877895393      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50793 (CLOSED)
gvproxy   60061 maciej   48u  IPv6 0x17117e6d683e509c      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50794 (CLOSED)
gvproxy   60061 maciej   49u  IPv6 0x4eddffe61685b75d      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50795 (CLOSED)
gvproxy   60061 maciej   50u  IPv6 0x494828cb4fb5d6d8      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50796 (CLOSED)
gvproxy   60061 maciej   51u  IPv6 0x56b1d31fdbef1026      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50797 (CLOSED)
gvproxy   60061 maciej   52u  IPv6  0x7224088b59fca3f      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50798 (CLOSED)
gvproxy   60061 maciej   53u  IPv6 0xd5816cb26426e997      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50799 (CLOSED)
gvproxy   60061 maciej   54u  IPv6 0xabc19d0040b80509      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50800 (CLOSED)
gvproxy   60061 maciej   55u  IPv6 0x208a3e5b83626444      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50802 (CLOSED)
gvproxy   60061 maciej   56u  IPv6 0x90ebbedee86163dd      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50801 (CLOSED)
gvproxy   60061 maciej   57u  IPv6 0x1929cc69b5caf122      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50804 (CLOSED)
gvproxy   60061 maciej   58u  IPv6 0x12271b9704d54c09      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50803 (CLOSED)
gvproxy   60061 maciej   59u  IPv6 0x515b09d096f5578e      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50805 (CLOSED)
gvproxy   60061 maciej   60u  IPv6 0xd47d82c2b4b907e9      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50806 (CLOSED)
gvproxy   60061 maciej   61u  IPv6 0xbe3dcde786c4c750      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50807 (CLOSED)
gvproxy   60061 maciej   62u  IPv6 0xbcb65b3f5988ad79      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50808 (CLOSED)
gvproxy   60061 maciej   63u  IPv6 0x45363382b4ed0203      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50809 (CLOSED)
gvproxy   60061 maciej   64u  IPv6 0xe19711d3fee821f2      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50811 (CLOSED)
gvproxy   60061 maciej   65u  IPv6 0x90b7bfcdd1f0f495      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50810 (CLOSED)
gvproxy   60061 maciej   66u  IPv6 0x432cfe824e976f6b      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50812 (CLOSED)
gvproxy   60061 maciej   67u  IPv6 0xe8d4fd3eb26bb504      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50813 (CLOSED)
gvproxy   60061 maciej   68u  IPv6 0x7b56d3044aedd9dc      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50814 (CLOSED)
gvproxy   60061 maciej   69u  IPv6  0xa79808b077ea8e9      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50815 (CLOSED)
gvproxy   60061 maciej   70u  IPv6  0xea3f01278f01df8      0t0  TCP [::127.0.0.1]:5432->[::127.0.0.1]:50816 (CLOSED)

Looks like a regression between 5.1.2 and 5.2.0. Downgrading back to 5.1.2 fixes this issue.

Steps to reproduce the issue

  1. Create a container running docker.io/library/postgres:11.
  2. Run high-concurrency tests against the container (I ran 20 concurrent connections; postgres was configured for 100 total).
  3. After a while postgres will start running out of connections.

Describe the results you received

Connections and file descriptors accumulate on the gvproxy side and are never released, until postgres runs out of connections or memory.

Describe the results you expected

podman info output

host:
  arch: arm64
  buildahVersion: 1.37.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 99.23
    systemPercent: 0.43
    userPercent: 0.35
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: coreos
    version: "40"
  eventLogger: journald
  freeLocks: 2046
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.9.11-200.fc40.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 458776576
  memTotal: 2044641280
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.0-1.20240801182221740927.main.8.g7cc82a4.fc40.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.13.0-dev
    package: netavark-1.12.1-1.20240802123603627403.main.4.geb4e5d3.fc40.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.13.0-dev
  ociRuntime:
    name: crun
    package: crun-1.15-1.20240731144341062436.main.62.ge6a1ef1.fc40.aarch64
    path: /usr/bin/crun
    version: |-
      crun version UNKNOWN
      commit: ce56cc2473f934a77c79c5cc79a65fe157dec42d
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240624.g1ee2eca-1.fc40.aarch64
    version: |
      pasta 0^20240624.g1ee2eca-1.fc40.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 0h 23m 47.00s
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 53082042368
  graphRootUsed: 3940200448
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.0
  Built: 1722556800
  BuiltTime: Fri Aug  2 03:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.5
  Os: linux
  OsArch: linux/arm64
  Version: 5.2.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

podman version:

Client:       Podman Engine
Version:      5.2.0
API Version:  5.2.0
Go Version:   go1.22.6
Git Commit:   b22d5c61eef93475413724f49fd6a32980d2c746
Built:        Thu Aug  1 20:32:02 2024
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      5.2.0
API Version:  5.2.0
Go Version:   go1.22.5
Built:        Fri Aug  2 03:00:00 2024
OS/Arch:      linux/arm64

Additional information


Luap99 commented 4 weeks ago

Can you replace only the gvproxy binary with the one included in the 5.1.2 installer? That should be gvproxy 0.7.3; we updated to 0.7.4 in the 5.2 installer (https://github.com/containers/gvisor-tap-vsock/releases). So I wonder if there is a regression there or on the VM side somehow.

maciej-szlosarczyk commented 4 weeks ago

Hey @Luap99!

It works properly with gvproxy 0.7.3:

% ./podman/5.2.0/libexec/podman/gvproxy --version
gvproxy version v0.7.3

% lsof -nP -iTCP | grep gvproxy
gvproxy   63436 maciej   11u  IPv4 0x921c900cb3b2c2c3      0t0  TCP 127.0.0.1:57485 (LISTEN)
gvproxy   63436 maciej   34u  IPv6 0x9a475d1349cde53d      0t0  TCP *:5432 (LISTEN)
gvproxy   63436 maciej   71u  IPv4 0x776bd7258f5958e0      0t0  TCP 127.0.0.1:57485->127.0.0.1:58106 (ESTABLISHED)

Machine internal:

core@localhost:~$ lsof -nP -iTCP
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rootlessp 2568 core   11u  IPv6  13502      0t0  TCP *:5432 (LISTEN)
Luap99 commented 4 weeks ago

@maciej-szlosarczyk Do you know how to use git bisect? It would be great if you could build gvproxy from source (make gvproxy in https://github.com/containers/gvisor-tap-vsock/), then copy the binary to your location and test again until you find the commit that caused the regression.

maciej-szlosarczyk commented 4 weeks ago

Bisect points at this commit:

https://github.com/containers/gvisor-tap-vsock/commit/600910caefc729efaaae16219be51d081284a104

I managed to narrow it down to these 15 or so lines:

https://github.com/containers/gvisor-tap-vsock/commit/600910caefc729efaaae16219be51d081284a104#diff-a888e5fb95e32a55bcba169802125c5ea2dbd00cf18815f1edfe4faa3e6a2d69L376-R408

Luap99 commented 4 weeks ago

Thanks

cc @praveenkumar @cfergeau

maciej-szlosarczyk commented 4 weeks ago

Hey!

I did some additional investigation there and discovered two things:

praveenkumar commented 3 weeks ago

@maciej-szlosarczyk Can you file a PR for it to https://github.com/containers/gvisor-tap-vsock repo?

cfergeau commented 3 weeks ago

The lines you pointed to as causing the regression are related to this inetaf/tcpproxy commit https://github.com/inetaf/tcpproxy/commit/2862066fc2a9405880f212f71230425bdfe9950e

I think what happens is that, before this commit, HandleConn returned as soon as either the src or dest connection was closed or returned an error. My guess is that after HandleConn returned, both connections were eventually garbage collected.

After commit 2862066fc2a9405880, however, the code waits for both src and dest to be closed or to return an error before HandleConn returns. I suspect that when gvproxy shows the bug, a lot of goroutines are stuck in this HandleConn method.

cfergeau commented 3 weeks ago

https://github.com/containers/gvisor-tap-vsock/pull/386 is merged and should address this issue. I'll make a gvisor-tap-vsock release soon. It's likely only papering over a gvisor-tap-vsock or tcpproxy bug, but if it's a preexisting gvisor-tap-vsock bug made more visible by the changes in https://github.com/containers/gvisor-tap-vsock/commit/600910caefc729efaaae16219be51d081284a104, then it's a bug that has been present for years, so we can live with it for a while longer.

Luap99 commented 3 weeks ago

Sorry, I was OOO; we did a new release yesterday. But if you do a new gvproxy release we can certainly do another release soon to fix this regression.