danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

Slack connector PostgreSQL connection closed after long indexing #1625

Open plopezamaya opened 3 weeks ago

plopezamaya commented 3 weeks ago

On version v3.0.75:

While indexing Slack with the Slack connector, the following error is raised:

[2024-06-10 15:36:50,541: INFO/MainProcess] Task check_for_document_sets_sync_task[32d2998b-3ac8-486a-836f-5f39332c29d0] succeeded in 0.04773302900002818s: None
06/10/2024 03:36:53 PM             utils.py  91 : [Attempt ID: 366] Slack call rate limited, retrying after 30 seconds. Exception: The request to the Slack API failed. (url: https://www.slack.com/api/conversations.list)
The server responded with: {'ok': False, 'error': 'ratelimited'}
[2024-06-10 15:36:54,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:36:55,552: INFO/MainProcess] Task check_for_document_sets_sync_task[f9132d9e-9e61-4470-aa9e-7dcbe8feaf44] received
[2024-06-10 15:36:55,601: INFO/MainProcess] Task check_for_document_sets_sync_task[f9132d9e-9e61-4470-aa9e-7dcbe8feaf44] succeeded in 0.047569072999976925s: None
[2024-06-10 15:36:59,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:00,613: INFO/MainProcess] Task check_for_document_sets_sync_task[4875b9db-b74a-4cef-a9fa-2fb8f156e925] received
[2024-06-10 15:37:00,662: INFO/MainProcess] Task check_for_document_sets_sync_task[4875b9db-b74a-4cef-a9fa-2fb8f156e925] succeeded in 0.048855686000024434s: None
[2024-06-10 15:37:04,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:05,672: INFO/MainProcess] Task check_for_document_sets_sync_task[b3382356-73dd-4e41-950c-0e6be1dd14aa] received
[2024-06-10 15:37:05,721: INFO/MainProcess] Task check_for_document_sets_sync_task[b3382356-73dd-4e41-950c-0e6be1dd14aa] succeeded in 0.048227491000034206s: None
[2024-06-10 15:37:09,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:10,732: INFO/MainProcess] Task check_for_document_sets_sync_task[7ef86c34-49e4-4802-8bdc-19ba7225b874] received
[2024-06-10 15:37:10,780: INFO/MainProcess] Task check_for_document_sets_sync_task[7ef86c34-49e4-4802-8bdc-19ba7225b874] succeeded in 0.04706027100019128s: None
[2024-06-10 15:37:14,778: INFO/MainProcess] Task check_for_document_sets_sync_task[ed9bab94-63b0-4079-923a-dbf681863aa7] received
[2024-06-10 15:37:14,829: INFO/MainProcess] Task check_for_document_sets_sync_task[ed9bab94-63b0-4079-923a-dbf681863aa7] succeeded in 0.05053329400016082s: None
[2024-06-10 15:37:14,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:19,837: INFO/MainProcess] Task check_for_document_sets_sync_task[c3e0ad2b-abe1-452a-864d-71a1b180bf15] received
[2024-06-10 15:37:19,886: INFO/MainProcess] Task check_for_document_sets_sync_task[c3e0ad2b-abe1-452a-864d-71a1b180bf15] succeeded in 0.04762887500010038s: None
[2024-06-10 15:37:19,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:24,897: INFO/MainProcess] Task check_for_document_sets_sync_task[c8d6fa4e-cf7d-418d-99d1-272f0b79ee1a] received
[2024-06-10 15:37:24,943: INFO/MainProcess] Task check_for_document_sets_sync_task[c8d6fa4e-cf7d-418d-99d1-272f0b79ee1a] succeeded in 0.04580861700014793s: None
[2024-06-10 15:37:24,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:29,956: INFO/MainProcess] Task check_for_document_sets_sync_task[a6b7938e-f29f-4fa4-89a8-5bc8c692d3f5] received
[2024-06-10 15:37:29,748: INFO/MainProcess] Scheduler: Sending due task check-for-document-set-sync (check_for_document_sets_sync_task)
[2024-06-10 15:37:30,010: INFO/MainProcess] Task check_for_document_sets_sync_task[a6b7938e-f29f-4fa4-89a8-5bc8c692d3f5] succeeded in 0.05322852100016462s: None
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/loading.py", line 668, in load_on_pk_identity
    session.execute(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2232, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2127, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1413, in execute
    return meth(
           ^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 483, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1637, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1987, in _exec_single_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2344, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1968, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) SSL connection has been closed unexpectedly

[SQL: SELECT connector.id, connector.name, connector.source, connector.input_type, connector.connector_specific_config, connector.refresh_freq, connector.time_created, connector.time_updated, connector.disabled 
FROM connector 
WHERE connector.id = %(pk_1)s]
[parameters: {'pk_1': 72}]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/danswer/background/indexing/run_indexing.py", line 382, in run_indexing_entrypoint
    _run_indexing(db_session, attempt)
  File "/app/danswer/background/indexing/run_indexing.py", line 288, in _run_indexing
    mark_attempt_failed(
  File "/app/danswer/db/index_attempt.py", line 104, in mark_attempt_failed
    db_session.commit()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1906, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1221, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1196, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4154, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4290, in _flush
    with util.safe_reraise():
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4251, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 467, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 644, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 85, in save_obj
    _emit_update_statements(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 904, in _emit_update_statements
    c = connection.execute(
        ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1413, in execute
    return meth(
           ^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 483, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1637, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1810, in _execute_context
    conn = self._revalidate_connection()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 665, in _revalidate_connection
    self._invalid_transaction()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 655, in _invalid_transaction
    raise exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back.  Please rollback() fully before proceeding (Background on this error at: https://sqlalche.me/e/20/8s2b)
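
The second traceback suggests that mark_attempt_failed() committed on a session whose transaction had already been invalidated by the dropped connection. One possible way to let the failure status still be recorded is to roll the session back first, as the error message asks. This is only a minimal sketch: `SessionLike` and `record_failure` are hypothetical names that do not exist in the danswer code base, and `mark_failed` stands in for whatever call updates the index attempt.

```python
from typing import Callable, Protocol


class SessionLike(Protocol):
    """Minimal subset of the SQLAlchemy Session interface used here."""

    def rollback(self) -> None: ...


def record_failure(session: SessionLike, mark_failed: Callable[[], None]) -> None:
    """Clear the invalid transaction, then persist the failure status.

    `mark_failed` is a hypothetical callback that is expected to update
    the attempt row and commit on the same session.
    """
    # After a disconnect, SQLAlchemy refuses further work on the session
    # until rollback() has been called.
    session.rollback()
    mark_failed()
```

With a real session, the call site might look like `record_failure(db_session, lambda: mark_attempt_failed(...))`, with the arguments left as they are today.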

I did some investigation, and the Slack problem actually seems to come from a PostgreSQL connection timeout in backend/danswer/background/indexing/run_indexing.py, at the db_session.refresh() call that runs just after the first indexing batch has been successfully prepared:

        doc_batch_generator, is_listing_complete = _get_document_generator(
            db_session=db_session,
            attempt=index_attempt,
            start_time=window_start,
            end_time=window_end,
        )

        try:
            all_connector_doc_ids: set[str] = set()
            for doc_batch in doc_batch_generator:
                # Check if connector is disabled mid run and stop if so unless it's the secondary
                # index being built. We want to populate it even for paused connectors
                # Often paused connectors are sources that aren't updated frequently but the
                # contents still need to be initially pulled.
                db_session.refresh(db_connector)
                if (
                    db_connector.disabled
                    and db_embedding_model.status != IndexModelStatus.FUTURE
                ):

The idea would be to configure the session so that its connection stays alive, since most of the time is spent inside the Slack connector while it retrieves the documents.
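
As a rough sketch of that keep-alive idea, the engine could be created with TCP keepalives enabled on the psycopg2 connection. The helper name `build_engine_kwargs` and all numeric values below are assumptions for illustration, not danswer settings. `pool_pre_ping` and `pool_recycle` are SQLAlchemy `create_engine` parameters, and the `keepalives*` keys are libpq connection parameters passed through `connect_args`. Whether this helps depends on what closes the connection: keepalives target idle network timeouts, not server-side session timeouts.

```python
from typing import Any


def build_engine_kwargs(idle_seconds: int = 30) -> dict[str, Any]:
    """Return keyword arguments for sqlalchemy.create_engine (illustrative values)."""
    return {
        # Test a pooled connection when it is checked out. This does not
        # protect a connection that a session already holds during a long fetch.
        "pool_pre_ping": True,
        # Replace pooled connections older than this many seconds.
        "pool_recycle": 1200,
        "connect_args": {
            "keepalives": 1,                  # enable TCP keepalives
            "keepalives_idle": idle_seconds,  # idle seconds before the first probe
            "keepalives_interval": 10,        # seconds between probes
            "keepalives_count": 5,            # failed probes before the link is dropped
        },
    }


# Usage, assuming SQLAlchemy is installed and `url` is the PostgreSQL URL:
#   engine = create_engine(url, **build_engine_kwargs())
```

Another option would be to avoid holding a transaction open while the connector is fetching, so that each db_session.refresh() starts from a freshly checked-out connection.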

It is similar to this issue: https://github.com/danswer-ai/danswer/issues/1622