Closed jalotra closed 2 months ago
Hi devs, I am not able to understand the indexing flow. Is there a reason why FastAPI's process pool is reused? These are the logs:
cognita-backend | DEBUG: 2024-09-08 12:43:44,504 - indexer:ingest_data:342 - Starting ingestion for data source fqn: localdir::/app/user_data/law-pdf
cognita-backend | INFO: 192.168.65.1:33019 - "POST /v1/collections/ingest HTTP/1.1" 201 Created
cognita-backend | INFO: 192.168.65.1:33019 - "GET /v1/collections/law-pdf-machine HTTP/1.1" 200 OK
cognita-backend | INFO: 192.168.65.1:33019 - "POST /v1/collections/data_ingestion_runs/list HTTP/1.1" 200 OK
cognita-backend | ERROR: 2024-09-08 12:43:44,617 - prismastore:aupdate_data_ingestion_run_status:551 - Failed to update data ingestion run status: Event loop is closed
cognita-backend | Traceback (most recent call last):
cognita-backend | File "/app/backend/modules/metadata_store/prismastore.py", line 538, in aupdate_data_ingestion_run_status
cognita-backend | ] = await self.db.ingestionruns.update(
When I look at what /v1/collections/ingest does, I see this:
try:
    process_pool = request.app.state.process_pool
except AttributeError:
    process_pool = None
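For readers following along, the lookup above can be sketched with the standard library only. This is a minimal illustration, not Cognita's actual code: `get_process_pool`, `run_job`, and the `state` argument are hypothetical names, and `state` stands in for `request.app.state`. It assumes the endpoint hands the work to `loop.run_in_executor`, which falls back to asyncio's default thread pool when the executor is `None`.

```python
import asyncio
from concurrent.futures import Executor
from types import SimpleNamespace
from typing import Any, Callable, Optional


def get_process_pool(state: Any) -> Optional[Executor]:
    # Equivalent to the try/except AttributeError above:
    # return the pool if one was attached to the state, else None.
    return getattr(state, "process_pool", None)


async def run_job(state: Any, job: Callable[..., Any], *args: Any) -> Any:
    # Hypothetical helper: run `job` in the attached pool if present.
    # With executor=None, asyncio uses its default thread pool executor.
    pool = get_process_pool(state)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, job, *args)


if __name__ == "__main__":
    # `state` has no process_pool attribute, so the fallback path is taken.
    state = SimpleNamespace()
    print(asyncio.run(run_job(state, len, "abc")))
```

The point of the sketch is only that the pool is whatever was attached to the application state earlier; the endpoint itself does not create it.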
I think the issue looks like this:
breaks out, and then successive runs are not possible.
Stack:
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
cognita-backend | raise exc from None
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 189, in handle_async_request
cognita-backend | await self._close_connections(closing)
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 305, in _close_connections
cognita-backend | await connection.aclose()
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 171, in aclose
cognita-backend | await self._connection.aclose()
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_async/http11.py", line 265, in aclose
cognita-backend | await self._network_stream.aclose()
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 55, in aclose
cognita-backend | await self._stream.aclose()
cognita-backend | File "/virtualenvs/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1202, in aclose
cognita-backend | self._transport.close()
cognita-backend | File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 864, in close
cognita-backend | self._loop.call_soon(self._call_connection_lost, None)
cognita-backend | File "/usr/local/lib/python3.11/asyncio/base_events.py", line 762, in call_soon
cognita-backend | self._check_closed()
cognita-backend | File "/usr/local/lib/python3.11/asyncio/base_events.py", line 520, in _check_closed
cognita-backend | raise RuntimeError('Event loop is closed')
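The last frames of the stack show `call_soon` being invoked on a loop that has already been closed. A minimal standard-library sketch of that failure mode follows. It only reproduces the error message, and it is an assumption that something similar happens here (for example, a connection created on one event loop being closed after that loop is gone); the thread does not confirm the root cause. The function name `schedule_on_closed_loop` is hypothetical.

```python
import asyncio


def schedule_on_closed_loop() -> str:
    # Create a loop, use it once, then close it.
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncio.sleep(0))
    loop.close()
    try:
        # Scheduling a callback on a closed loop fails in _check_closed(),
        # the same check that appears at the bottom of the stack above.
        loop.call_soon(lambda: None)
    except RuntimeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(schedule_on_closed_loop())
```

Any object that keeps a reference to a closed loop (a transport, a client, a connection pool) will hit the same check when it later tries to schedule work on it.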
We are not using FastAPI's pool; the pool attached to request.app.state is created at app init. The event loop closing is unexpected. We'll try to reproduce and fix this.