wearefrank / zaakbrug

An app for Dutch municipalities that supports the transition from "zaak- en documentatieservices" (zds) to "zaakgericht werken" (zgw).
https://zaakbrug.nl
European Union Public License 1.2

zaakBrug lost connection with Postgresql (and didn't reconnect) #160

Open Geert-Jan-DH opened 12 months ago

Geert-Jan-DH commented 12 months ago

On two occasions this week we saw a "this connection has been closed" error in Ladybug when calling "genereerDocumentIdentificatie_Di02", resulting in an error message to the calling application.

Generating a zaakIdentificatie had the same problem.

One incident occurred on our test environment and one on our production environment.

The problem was mitigated by restarting the server.

Other servers in the same Azure cloud using Postgresql never seem to have these issues.

Due to the immediate restart, logs are not available any more.

MLenterman commented 12 months ago

Does this also happen with the latest (1.13.4) version? This issue is caused by the application running out of memory. The latest version of ZaakBrug includes an upgrade from Java 8 to Java 11, which should help significantly with memory usage and memory behavior in containerized environments. Alternatively, assigning more memory to ZaakBrug should also remedy this issue.
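
To judge whether memory is the limiting factor, it can help to check how much heap the JVM believes it has inside the container. The following is a minimal sketch and not part of ZaakBrug; the class name `HeapReport` and the helper `toMiB` are made up for illustration, and only standard `java.lang.Runtime` calls are used.

```java
final class HeapReport {
    /** Converts a byte count to whole mebibytes (rounded down). Hypothetical helper. */
    static String toMiB(long bytes) {
        return (bytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        long used = rt.totalMemory() - rt.freeMemory();
        System.out.println("max heap the JVM may use: " + toMiB(rt.maxMemory()));
        System.out.println("heap currently reserved:  " + toMiB(rt.totalMemory()));
        System.out.println("heap currently in use:    " + toMiB(used));
    }
}
```

Running this inside the same container image and with the same memory limit as the application shows the ceiling the JVM has picked. On Java 11 that ceiling is derived from the container memory limit by default and can be adjusted with the standard HotSpot option `-XX:MaxRAMPercentage`; which value suits ZaakBrug is not something this thread establishes.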

I will still mark it as a bug, because a loss of connection to the database should make the health endpoint return UNHEALTHY, so that the container host can restart it.
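
The behavior described here, a health check that turns UNHEALTHY once the database connection is lost, could look roughly like the sketch below. This is an illustration under assumptions, not the health endpoint of ZaakBrug or the Frank!Framework: `DatabaseHealthCheck`, `ConnectionSupplier`, and `check` are hypothetical names. The only real API used is JDBC's `Connection.isValid(int)`.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class DatabaseHealthCheck {
    enum Status { HEALTHY, UNHEALTHY }

    /** Hypothetical hook; a javax.sql.DataSource fits via dataSource::getConnection. */
    @FunctionalInterface
    interface ConnectionSupplier {
        Connection get() throws SQLException;
    }

    /**
     * Borrows a connection and asks the driver to validate it.
     * Any SQLException (for example "this connection has been closed")
     * is reported as UNHEALTHY instead of being propagated.
     */
    static Status check(ConnectionSupplier supplier, int timeoutSeconds) {
        try (Connection connection = supplier.get()) {
            return connection.isValid(timeoutSeconds) ? Status.HEALTHY : Status.UNHEALTHY;
        } catch (SQLException e) {
            return Status.UNHEALTHY;
        }
    }

    public static void main(String[] args) {
        // Simulates a database that cannot hand out a usable connection.
        Status status = check(() -> {
            throw new SQLException("this connection has been closed");
        }, 2);
        System.out.println(status); // prints UNHEALTHY
    }
}
```

A health endpoint built on such a check would let the container host notice the lost connection and restart the container, instead of requiring a manual server restart.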

Geert-Jan-DH commented 12 months ago

Neither environment is running version 1.13.4 yet.

MLenterman commented 12 months ago

ibissource/iaf#5441

checkiecheck commented 8 months ago

Related to #236, where we'll address more memory settings too.