write0nly opened this issue 3 years ago
For the record, this seems to happen because the connection pool is too small by default (unset?). If we set max_idle_connections > max_parallel,
the connections are not torn down and there is no churn. The obvious downside is that many connections stay open, but maybe max_parallel
can be lowered too.
The following settings worked flawlessly:
max_idle_connections = 256
max_parallel = 128
IMHO this could become: 1- change the default so that max_idle_connections >= max_parallel, and 2- document this clearly on the postgresql backend page.
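For reference, both of these parameters go in the postgresql storage stanza of the Vault server config. A minimal sketch, with a placeholder connection_url:

```hcl
# Vault server config (HCL) - postgresql storage backend
storage "postgresql" {
  # placeholder connection string; replace with your own
  connection_url       = "postgres://vault:password@db.example.internal:5432/vault"

  # keep at least as many idle connections as parallel operations so the
  # pool is reused instead of churning through ephemeral ports
  max_idle_connections = 256
  max_parallel         = 128
}
```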
Hi @write0nly,
This suggestion makes good sense to me, I'm all for it. I'm not sure when we'll get to it though, feel free to submit a PR if you get impatient.
Hi @write0nly - following up on Nick's comment, was this work that you'd be interested in taking up and filing a PR for? Please let us know how we can help. Thanks!
Thanks for this. We have a small setup with the mysql backend and faced the same issue. In our case, the following configuration also works smoothly:
max_idle_connections = 10
max_parallel = 5
This issue was caught in QA/stress testing and is not really expected in a production environment; however, it could also be forced by users who can log in to the Vault server host.
Because Vault (when using the postgresql backend) makes many fast, ephemeral connections to PostgreSQL to create and delete leases and entries in the DB, too many connections stuck in TIME_WAIT or CLOSE_WAIT can cause Vault to cycle through the entire ephemeral port range and eventually run out of ports, printing the following type of error:
After some time of this happening, Vault errors and seals itself. In the case where Vault tries to expire leases on startup and has too many of them (say 200k that need to expire), it exhausts the available ports and then seals itself. This makes Vault unusable, because it keeps re-sealing itself over and over again.
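One way to observe the churn on the Vault host (assuming PostgreSQL is listening on its default port 5432) is to count sockets stuck in TIME_WAIT:

```shell
# count outbound TCP sockets to PostgreSQL sitting in TIME_WAIT
# (5432 is an assumption; substitute your database port)
ss -tan state time-wait '( dport = :5432 )' | wc -l
```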
If the user is persistent and also unseals Vault in a loop, Vault will reach a stable point once the number of leases drops below 10k, which can be seen with this query:
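The exact query isn't captured in this excerpt; a rough count along these lines (assuming the backend's default vault_kv_store table and the usual sys/expire/id/ lease prefix) would show the same backlog:

```sql
-- approximate count of outstanding lease entries in the storage table
-- (table name and path prefix are assumptions based on the defaults)
SELECT count(*) FROM vault_kv_store WHERE path LIKE 'sys/expire/id/%';
```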
Steps to reproduce the behavior:
1. Have a Vault cluster (tested on v1.7.2) using the postgresql storage backend. This was tested in a cluster but may also happen on a single Vault.
2. Run vault write auth/approle/login role_id=... secret_id=... in a loop millions of times until there are 200k+ outstanding Vault tokens that are going to expire (see the sketch after this list).
3. Stop all Vaults in the cluster, let's say due to an upgrade.
4. Make sure you have a large number of outstanding leases in the DB:
5. Restart and unseal the Vaults.
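A rough sketch of the login loop from step 2 (ROLE_ID and SECRET_ID are placeholders for a real AppRole):

```shell
# generate a large number of short-lived AppRole tokens/leases
for i in $(seq 1 200000); do
  vault write auth/approle/login role_id="$ROLE_ID" secret_id="$SECRET_ID" > /dev/null
done
```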
Expected behavior
Observed behaviour