treasure-data / digdag

Workload Automation System
https://www.digdag.io/
Apache License 2.0

digdag server made too many Postgres connections on my machine #478

Closed kitsuyui closed 7 years ago

kitsuyui commented 7 years ago

https://github.com/treasure-data/digdag/pull/280/commits/780d61780ae5658547b87b44222a870942ebc715#diff-d11545055f57d9e418a11dd1e06f680d

        int maximumPoolSize = config.get("database.maximumPoolSize", int.class,
                Runtime.getRuntime().availableProcessors() * 32); // HikariCP default: 10

My machine has 32 cores, so digdag tried to open 32 * 32 = 1024 connections. That exceeded my Postgres server's max_connections, and digdag occupied every Postgres connection on the server. Other processes could not connect or reconnect, which broke other jobs on the server.

After I set database.maximumPoolSize = 10 in postgresql.properties, the problem stopped.
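
To make the arithmetic concrete, here is a minimal sketch of how the default and the explicit override interact. It assumes the setting is read as a plain key from a `java.util.Properties` object; the class `PoolSizeSketch` and its methods are hypothetical names for illustration, not digdag's actual code.

```java
import java.util.Properties;

// Illustration only: mirrors the default quoted above; not digdag's real class.
final class PoolSizeSketch {
    // Default from the quoted snippet: available processors * 32.
    static int defaultPoolSize(int availableProcessors) {
        return availableProcessors * 32;
    }

    // Uses database.maximumPoolSize when it is set, otherwise the default.
    static int resolvePoolSize(Properties props, int availableProcessors) {
        String explicit = props.getProperty("database.maximumPoolSize");
        if (explicit == null) {
            return defaultPoolSize(availableProcessors);
        }
        return Integer.parseInt(explicit.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolvePoolSize(props, 32)); // prints 1024

        props.setProperty("database.maximumPoolSize", "10");
        System.out.println(resolvePoolSize(props, 32)); // prints 10
    }
}
```

On a 32-core machine the unset case yields 1024, which is the number of connections reported above.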

To prevent this from happening to others

There are various ways:

  1. Limit maximumPoolSize to some percentage of the server's max_connections parameter (obtained with SHOW max_connections).
  2. Log a warning when the pool size exceeds it.
  3. Document maximumPoolSize more thoroughly and urge users to set it when creating the postgres.properties file (or encourage them to set up pgpool).
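
Options 1 and 2 could be combined roughly as follows. This is only a sketch under stated assumptions: the server's max_connections value is passed in as a plain integer (in practice it would come from running SHOW max_connections over an existing connection), and `PoolCapSketch`, `capPoolSize`, and the `fraction` parameter are hypothetical names, not part of digdag or HikariCP.

```java
import java.util.logging.Logger;

// Illustration only: caps a requested pool size at a share of max_connections.
final class PoolCapSketch {
    private static final Logger LOG = Logger.getLogger(PoolCapSketch.class.getName());

    // Returns the requested size, or the cap if the request exceeds it.
    // The cap is fraction * serverMaxConnections, rounded down, but at least 1.
    static int capPoolSize(int requested, int serverMaxConnections, double fraction) {
        int limit = Math.max(1, (int) Math.floor(serverMaxConnections * fraction));
        if (requested > limit) {
            LOG.warning(String.format(
                    "Requested pool size %d exceeds %d (%.0f%% of max_connections=%d); capping.",
                    requested, limit, fraction * 100, serverMaxConnections));
            return limit;
        }
        return requested;
    }

    public static void main(String[] args) {
        // Example values: 1024 requested, max_connections = 100, cap at 50%.
        System.out.println(capPoolSize(1024, 100, 0.5)); // prints 50
        System.out.println(capPoolSize(10, 100, 0.5));   // prints 10
    }
}
```

A cap like this only protects other clients of the database; it does not by itself address whether the application can run safely with the smaller pool.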

frsyuki commented 7 years ago

Currently the maximum pool size can't be small, because the current code can deadlock: it establishes a new connection while another transaction is still running in the same thread. The pool size must effectively be unlimited to avoid the deadlock. In the meantime, a workaround is to use a smaller number of threads.
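
The deadlock described here can be modeled without a database. The sketch below stands in for the connection pool with a `java.util.concurrent.Semaphore`: one thread takes a connection for its outer transaction and then asks for a second one while still holding the first. `NestedAcquireSketch` is a hypothetical model for illustration, not digdag's code, and it uses a timeout so that the blocked case returns instead of hanging.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Illustration only: a semaphore plays the role of a fixed-size connection pool.
final class NestedAcquireSketch {
    // Holds one "connection", then tries to take a second within the timeout.
    // Returns true if the second acquisition succeeds.
    static boolean nestedAcquireSucceeds(int poolSize, long timeoutMillis) {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be >= 1");
        }
        Semaphore pool = new Semaphore(poolSize);
        pool.acquireUninterruptibly(); // outer transaction holds a connection
        try {
            // Nested acquisition while the outer connection is still held.
            boolean gotSecond = pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                pool.release(); // return the nested connection
            }
            return gotSecond;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.release(); // return the outer connection
        }
    }

    public static void main(String[] args) {
        System.out.println(nestedAcquireSucceeds(1, 100)); // prints false
        System.out.println(nestedAcquireSucceeds(2, 100)); // prints true
    }
}
```

With a real pool and no timeout, the first case would block forever. The same thing happens with N threads when each holds one connection and waits for a second from a pool of N or fewer, which is why reducing the thread count helps.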

#466 is the fundamental fix for it.

kitsuyui commented 7 years ago

Thanks, I look forward to #466 being released! 😄

komamitsu commented 7 years ago

@kitsuyui We released https://github.com/treasure-data/digdag/pull/515 instead of #466 as 0.9.8. It would be great if you could try it.

kitsuyui commented 7 years ago

@komamitsu @frsyuki

I tried v0.9.16. The problem was certainly mitigated (1024 connections down to 80). But that is still all the connections available on my machine, so I have to limit database.maximumPoolSize in postgres.properties explicitly.

I don't feel very good about this...

frsyuki commented 7 years ago

Use a small connection pool size by setting database.maximumPoolSize (integer, default: available CPU cores * 32).

brettwooldridge commented 7 years ago

@frsyuki How about this?