portal-api crashes if connection to Postgres times out

If portal-api cannot connect to its Postgres database within 10 seconds (the predefined timeout for connecting to Postgres), the portal-api process crashes and has to be restarted. With LOG_LEVEL=debug enabled, the stack trace can look like this (here, a connect error while checking the webhook queue during a "cron" job):

error: [+3743ms] portal-api:webhooks            *** COULD NOT GET WEBHOOKS
error: [+   0ms] portal-api:dao:pg:webhooks     ERROR dispatching webhook events
error: [+   0ms] portal-api:dao:pg:webhooks     {}
Error: timeout exceeded when trying to connect
    at Timeout.setTimeout [as _onTimeout] (/usr/src/app/node_modules/pg-pool/index.js:165:27)
    at ontimeout (timers.js:424:11)
    at tryOnTimeout (timers.js:288:5)
    at listOnTimeout (timers.js:251:5)
    at Timer.processTimers (timers.js:211:10)
/usr/src/app/node_modules/async/dist/async.js:966
        if (fn === null) throw new Error("Callback was already called.");
                         ^

Error: Callback was already called.
    at /usr/src/app/node_modules/async/dist/async.js:966:32
    at /usr/src/app/node_modules/async/dist/async.js:3885:13
    at poolOrClient.query (/usr/src/app/dao/postgres/pg-utils.js:343:24)
    at Object.connect [as callback] (/usr/src/app/node_modules/pg-pool/index.js:253:16)
    at /usr/src/app/node_modules/pg-pool/index.js:170:18
    at client.connect (/usr/src/app/node_modules/pg-pool/index.js:219:9)
    at Connection.con.once (/usr/src/app/node_modules/pg/lib/client.js:182:11)
    at Object.onceWrapper (events.js:273:13)
    at Connection.emit (events.js:182:13)
    at Socket.<anonymous> (/usr/src/app/node_modules/pg/lib/connection.js:76:10)

A retry mechanism should be implemented for these cases. This was seen with Azure Postgres as a service, but almost never with other Postgres deployments (such as running Postgres as a pod). Nonetheless, it must be addressed.
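
As a rough illustration of what such a retry could look like, here is a minimal sketch. The helper name `withRetry` and its `retries` and `delayMs` options are hypothetical and not part of portal-api. It assumes the connect step can be expressed as a function returning a promise, and that a fixed delay between attempts is acceptable.

```javascript
'use strict';

// Hypothetical helper (not existing portal-api code): runs an async
// operation and retries it with a fixed delay if it rejects.
async function withRetry(operation, { retries = 5, delayMs = 2000 } = {}) {
    let lastError;
    for (let attempt = 1; attempt <= retries; attempt++) {
        try {
            // Success: hand the result back to the caller immediately.
            return await operation();
        } catch (err) {
            lastError = err;
            // Wait before the next attempt, but not after the final one.
            if (attempt < retries) {
                await new Promise((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    // All attempts failed: surface the last error instead of crashing.
    throw lastError;
}

// Possible usage, assuming `pool` is a pg Pool instance:
//   const client = await withRetry(() => pool.connect());
//   try { /* run queries */ } finally { client.release(); }
```

With this shape, a connect timeout becomes a rejected promise that the caller can handle after the attempts are exhausted, rather than an exception that takes down the process. Whether a fixed delay or a backoff fits better would depend on how the Azure timeouts behave in practice.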

Observed with 1.0.0.beta3.

Haufe-Lexware / wicked.haufe.io
