Intevation / intelmq-fody-backend

A backend to serve intelmq-cb-mailgen data for the webapp fody.
2 stars 2 forks source link

Reopen postgresql db connection to be more robust #12

Open bernhardreiter opened 7 years ago

bernhardreiter commented 7 years ago

ffb268e325d0f6346100fc2fe21b5d24818e27f0 v0.5.1 will fail if a database connection is closed. Apache's mod_wsgi may restart the processes onces in the while, so there is some self healing.

Simple test to reproduce: systemctl restart postgresql.service in a running contactdb_api results into

cur = contactdb_conn.cursor(cursor_factory=RealDictCursor)
psycopg2.InterfaceError: connection already closed

Technical idea:

Use psycopg2.pool.ThreadedConnectionPool().

This might also help to speed up access and use multiple threads with wsgi, which would need an update in the Apache config instructions.

bernhardreiter commented 7 years ago

Did some planning and also talked to @bernhard-herzog:

This is a task which requires some code changes. Thoughts:

To make the code easier to read it would be good to use a function to get new connections. One way is to pass the connection or cursor object down to the called functions. If we want this to be implicit, someone could use a class which has the connection object as attribute.

In addtion a @contextmanager could make the code more readable.

bernhardreiter commented 7 years ago

Considered a workaround to limit the time an apache WSGI process runs, to make the process self healing. A simple change to modify the 001-fody.conf to read

WSGIDaemonProcess www-fody threads=1 maximum-requests=10000 restart-interval=3600 graceful-timeout=30

does not work on Ubuntu 16.04LTS, because it would require at least http://modwsgi.readthedocs.io/en/latest/release-notes/version-4.5.12.html#new-features which we don't have.

bernhardreiter commented 5 years ago

For the evaluation of this issue: the operating instructions (currently part of the README.md) already state that the web server process has to be restarted if the database is to be restarted. Other cases so far were not reported in production setting. It is recommended to have log-monitoring in place for a fully automated system, so this should raise an alert in this (or other cases that are more likely).

ghost commented 5 years ago

It is recommended to have log-monitoring in place for a fully automated system, so this should raise an alert in this (or other cases that are more likely).

That need administrator attention and manual intervention. However, it can be handled automatically.

bernhardreiter commented 5 years ago

@wagner-certat we seem to agree that the current behavior can be improved. Which is the purpose of this issue after all. ;-) My remark was helping to prioritize it.

wagner-intevation commented 2 years ago

This is still an issue. If the DB was gone and is back up, the backend continues throwing 500s:

intelmq-fody-backend    | 172.28.0.1 - - [18/Jul/2022 14:05:19] "POST /api/login/ HTTP/1.1" 200 72
intelmq-fody-backend    | Traceback (most recent call last):
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 949, in searchcidr
intelmq-fody-backend    |     query_results = __db_query_organisation_ids("""
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 259, in __db_query_organisation_ids
intelmq-fody-backend    |     description, results = _db_query(operation_str.format(""), parameters)
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 207, in _db_query
intelmq-fody-backend    |     cur = contactdb_conn.cursor(cursor_factory=RealDictCursor)
intelmq-fody-backend    | psycopg2.InterfaceError: connection already closed
intelmq-fody-backend    | 
intelmq-fody-backend    | During handling of the above exception, another exception occurred:
intelmq-fody-backend    | 
intelmq-fody-backend    | Traceback (most recent call last):
intelmq-fody-backend    |   File "/usr/lib/python3.8/wsgiref/handlers.py", line 137, in run
intelmq-fody-backend    |     self.result = application(self.environ, self.start_response)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/falcon/api.py", line 269, in __call__
intelmq-fody-backend    |     responder(req, resp, **params)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 947, in __call__
intelmq-fody-backend    |     raise exception
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 918, in __call__
intelmq-fody-backend    |     self.call_function(input_parameters), context, request, response, **kwargs
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 840, in call_function
intelmq-fody-backend    |     return self.interface(**parameters)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 129, in __call__
intelmq-fody-backend    |     return __hug_internal_self._function(*args, **kwargs)
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 967, in searchcidr
intelmq-fody-backend    |     __commit_transaction()
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 162, in __commit_transaction
intelmq-fody-backend    |     contactdb_conn.commit()
intelmq-fody-backend    | psycopg2.InterfaceError: connection already closed
wagner-intevation commented 1 month ago

This is what a DB restart (recovery after error in WAL) caused for the ticket api:

2024-07-29 13:45:46 [FALCON] [ERROR] GET /api/tickets/stats?sent-at_after=2024-07-29&sent-at_before=2024-07-30 => Traceback (most recent call last):, referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 818, in stats, referer: https://automatix.cert-bund.de:8000/
    results = query(prep), referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 558, in query, referer: https://automatix.cert-bund.de:8000/
    cur = eventdb_conn.cursor(cursor_factory=RealDictCursor), referer: https://automatix.cert-bund.de:8000/
psycopg2.InterfaceError: connection already closed, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/
During handling of the above exception, another exception occurred:, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/
Traceback (most recent call last):, referer: https://automatix.cert-bund.de:8000/
  File "falcon/app.py", line 361, in falcon.app.App.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 947, in hug.interface.HTTP.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 918, in hug.interface.HTTP.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 840, in hug.interface.HTTP.call_function, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 129, in hug.interface.Interfaces.__call__, referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 825, in stats, referer: https://automatix.cert-bund.de:8000/
    __rollback_transaction(), referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 133, in __rollback_transaction, referer: https://automatix.cert-bund.de:8000/
    eventdb_conn.rollback(), referer: https://automatix.cert-bund.de:8000/
psycopg2.InterfaceError: connection already closed, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/