docker-library / postgres

Docker Official Image packaging for Postgres
http://www.postgresql.org
MIT License

Document / adjust defaults for possible `connection (already) closed` issue when used on Swarm (due to IPVS) #1110

Open u1735067 opened 1 year ago

u1735067 commented 1 year ago

Issue https://github.com/docker-library/postgres/issues/538 introduced a warning in the README about the Docker Swarm IPVS load balancer, which times out idle TCP connections after 900 seconds. That is lower than the default `tcp_keepalive_time`, so idle connections can be cut by IPVS while they are still needed later. The warning has since been removed (https://github.com/docker-library/docs/commit/5e28015ab2d9039a28daca5f7d65be996eb39234), probably because the link it pointed to was broken.
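
To make the timing concrete, here is a rough Python sketch of the comparison. It assumes the 900-second IPVS timeout quoted above, the Linux default of 7200 seconds for `net.ipv4.tcp_keepalive_time`, and that a keepalive probe passing through IPVS refreshes its idle timer. The function names are made up for illustration.

```python
# Assumed values: 900 s comes from this issue, 7200 s is the Linux default
# for net.ipv4.tcp_keepalive_time.
IPVS_IDLE_TIMEOUT_S = 900
LINUX_DEFAULT_KEEPALIVE_TIME_S = 7200


def first_probe_delay(tcp_keepalives_idle, os_keepalive_time=LINUX_DEFAULT_KEEPALIVE_TIME_S):
    """Seconds of idleness before the first keepalive probe is sent.

    PostgreSQL treats tcp_keepalives_idle=0 as "use the operating system default".
    """
    return tcp_keepalives_idle if tcp_keepalives_idle > 0 else os_keepalive_time


def survives_ipvs(tcp_keepalives_idle, os_keepalive_time=LINUX_DEFAULT_KEEPALIVE_TIME_S,
                  ipvs_timeout=IPVS_IDLE_TIMEOUT_S):
    """True if the first probe goes out before IPVS would drop the idle connection."""
    return first_probe_delay(tcp_keepalives_idle, os_keepalive_time) < ipvs_timeout


print(survives_ipvs(0))    # PostgreSQL default: falls back to the 7200 s OS default
print(survives_ipvs(870))  # value suggested in this issue
```

With the defaults the first probe would only be sent after two hours, long after IPVS has forgotten the connection, whereas any idle value below 900 keeps the entry alive.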

Would it be possible to reintroduce a warning about this, and/or to set a default such as `-ctcp_keepalives_idle=870` (and perhaps `tcp_keepalives_interval` and `tcp_keepalives_count` as well)?
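
For context on what the interval and count settings add, this small Python sketch estimates how long a dead peer goes unnoticed for a given combination. It assumes the usual TCP keepalive behaviour (first probe after the idle time, then one probe per interval until `count` probes go unanswered) and uses a hypothetical helper name. The numbers are example values only, not a recommendation.

```python
def dead_peer_detection_seconds(idle, interval, count):
    """Approximate seconds from the last traffic until a dead peer is given up on."""
    return idle + interval * count


# Example values: idle=300, interval=30, count=5
print(dead_peer_detection_seconds(idle=300, interval=30, count=5))  # 450

# idle=870 combined with the Linux defaults for interval (75 s) and probes (9)
print(dead_peer_detection_seconds(idle=870, interval=75, count=9))  # 1545
```

Only the idle value matters for staying under the IPVS timeout. Interval and count mainly control how quickly a connection that is really gone gets cleaned up.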

The old "success" documentation is visible at https://web.archive.org/web/20200611114911/https://success.docker.com/article/ipvs-connection-timeout-issue.

Possible solutions, all combined in the example below (only one is needed):

```yaml
services:
  postgres:
    command:
      - postgres
      - -ctcp_keepalives_idle=300  # < 900
      # Maybe this too ?
      # - -ctcp_keepalives_interval=30
      # - -ctcp_keepalives_count=5
    sysctls:
      net.ipv4.tcp_keepalive_time: 720  # < 900
    deploy:
      endpoint_mode: dnsrr  # The client should resolve on each connection in case the task (IP) changed

  my_other_service:
    environment:
      POSTGRES_HOST: tasks.postgres  # The client should resolve on each connection in case the task (IP) changed
```

tianon commented 8 months ago

Sorry for the delay! :sob:

Unfortunately, I think tuning PostgreSQL for use within Swarm is probably out of scope for this repository. :see_no_evil: :disappointed:

Maybe we can add back a really small blurb about the problem in the docs, perhaps using this issue as our link instead of that old success article?