truefoundry / cognita

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
https://cognita.truefoundry.com
Apache License 2.0

Event loop closed while updating indexing status #330

Closed jalotra closed 2 months ago

jalotra commented 2 months ago

Hi devs, I can't follow the indexing flow. Is there a reason FastAPI's process pool is reused across runs? Here are the logs:

cognita-backend   | DEBUG:    2024-09-08 12:43:44,504 - indexer:ingest_data:342 - Starting ingestion for data source fqn: localdir::/app/user_data/law-pdf
cognita-backend   | INFO:     192.168.65.1:33019 - "POST /v1/collections/ingest HTTP/1.1" 201 Created
cognita-backend   | INFO:     192.168.65.1:33019 - "GET /v1/collections/law-pdf-machine HTTP/1.1" 200 OK
cognita-backend   | INFO:     192.168.65.1:33019 - "POST /v1/collections/data_ingestion_runs/list HTTP/1.1" 200 OK
cognita-backend   | ERROR:    2024-09-08 12:43:44,617 - prismastore:aupdate_data_ingestion_run_status:551 - Failed to update data ingestion run status: Event loop is closed
cognita-backend   | Traceback (most recent call last):
cognita-backend   |   File "/app/backend/modules/metadata_store/prismastore.py", line 538, in aupdate_data_ingestion_run_status
cognita-backend   |     ] = await self.db.ingestionruns.update(

Looking at what /v1/collections/ingest does, I see this:

try:
    process_pool = request.app.state.process_pool
except AttributeError:
    process_pool = None

I think the issue is this:

  1. If anything breaks in a previous run, the event loop is closed, and subsequent runs then fail.

Stack trace:

cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
cognita-backend   |     raise exc from None
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 189, in handle_async_request
cognita-backend   |     await self._close_connections(closing)
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 305, in _close_connections
cognita-backend   |     await connection.aclose()
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 171, in aclose
cognita-backend   |     await self._connection.aclose()
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/http11.py", line 265, in aclose
cognita-backend   |     await self._network_stream.aclose()
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 55, in aclose
cognita-backend   |     await self._stream.aclose()
cognita-backend   |   File "/virtualenvs/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1202, in aclose
cognita-backend   |     self._transport.close()
cognita-backend   |   File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 864, in close
cognita-backend   |     self._loop.call_soon(self._call_connection_lost, None)
cognita-backend   |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 762, in call_soon
cognita-backend   |     self._check_closed()
cognita-backend   |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 520, in _check_closed
cognita-backend   |     raise RuntimeError('Event loop is closed')
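
The last frames of the trace show a transport calling call_soon on a loop that has already been closed. One plausible way to end up there, offered only as a sketch and not as the confirmed cause in Cognita, is a long-lived client that keeps a reference to the event loop it was created on while each run starts and closes its own loop. LoopBoundClient and the helper functions below are hypothetical names used for illustration; they are not part of Cognita, Prisma, or httpcore.

```python
import asyncio


class LoopBoundClient:
    """Hypothetical stand-in for a client that remembers the loop it was created on."""

    def __init__(self) -> None:
        self.loop = asyncio.get_running_loop()

    async def update_status(self, status: str) -> str:
        # Like a transport, schedule a callback on the loop captured at creation time.
        done = self.loop.create_future()
        self.loop.call_soon(done.set_result, status)
        return await done


_cached_client = None  # survives across runs, like a module-level singleton


async def update_with_cached_client(status: str) -> str:
    # Reuses the client from an earlier run, together with that run's loop.
    global _cached_client
    if _cached_client is None:
        _cached_client = LoopBoundClient()
    return await _cached_client.update_status(status)


async def update_with_fresh_client(status: str) -> str:
    # Builds the client inside the run, so it is bound to the live loop.
    return await LoopBoundClient().update_status(status)


def run_once(coro_fn, status: str) -> str:
    # asyncio.run creates a new loop and closes it when the coroutine finishes.
    try:
        return asyncio.run(coro_fn(status))
    except RuntimeError as exc:
        return f"failed: {exc}"


cached_results = [run_once(update_with_cached_client, "COMPLETED") for _ in range(2)]
fresh_results = [run_once(update_with_fresh_client, "COMPLETED") for _ in range(2)]
print(cached_results)
print(fresh_results)
```

In this sketch the first run with the cached client succeeds and the second fails with "Event loop is closed", while the variant that creates the client inside each run succeeds both times. That matches the symptom of one run leaving later runs broken, though the actual trigger in the backend may differ.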

chiragjn commented 2 months ago

We are not using FastAPI's pool; the pool attached to request.app.state is created during app init. The event loop closing is unexpected. We'll try to reproduce and fix this.
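
To make the pattern described here concrete, the following is a minimal sketch of a pool that is created once at app init, stored on the app's state, and looked up per request with the same try/except shown earlier in the thread. It uses types.SimpleNamespace as a stand-in for the FastAPI app and request objects so that it runs with the standard library alone; init_app and resolve_process_pool are hypothetical names, not Cognita functions.

```python
from concurrent.futures import ProcessPoolExecutor
from types import SimpleNamespace


def init_app(with_pool: bool) -> SimpleNamespace:
    # Stand-in for app init: the pool is created once and stored on app.state.
    app = SimpleNamespace(state=SimpleNamespace())
    if with_pool:
        app.state.process_pool = ProcessPoolExecutor(max_workers=1)
    return app


def resolve_process_pool(request):
    # Same lookup as in the ingest handler: fall back to None if no pool was attached.
    try:
        return request.app.state.process_pool
    except AttributeError:
        return None


app_with_pool = init_app(with_pool=True)
app_without_pool = init_app(with_pool=False)

# Two requests against the same app see the same pool object.
request_a = SimpleNamespace(app=app_with_pool)
request_b = SimpleNamespace(app=app_with_pool)
same_pool = resolve_process_pool(request_a) is resolve_process_pool(request_b)

# An app that never attached a pool yields None instead of raising.
missing = resolve_process_pool(SimpleNamespace(app=app_without_pool))

app_with_pool.state.process_pool.shutdown()
print(same_pool, missing)
```

Under these assumptions every request shares one pool for the lifetime of the app, which is why the pool is reused across ingestion runs without being FastAPI's own.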