truefoundry / cognita

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
https://cognita.truefoundry.com
Apache License 2.0
3.15k stars, 251 forks

The backend API (http://localhost:8000) stops responding with the error "upstream request timeout" #305

Open clive97 opened 4 weeks ago

clive97 commented 4 weeks ago

Hi,

I'm running Cognita with the latest commit on my local machine using Docker Compose. Everything appears to be functioning normally, with no obvious errors in the logs. I was able to create a Data Source and successfully upload PDF files to the backend. However, when I attempt to create a new collection, the API at http://localhost:8000 stops responding and returns an "upstream request timeout" error. The API only resumes functioning once the backend has finished parsing the files and completes the collection creation process.

While this isn't a major issue in a testing environment, where I can do other tasks while waiting, it becomes problematic in a production environment. Multiple users might be accessing the portal simultaneously to perform different tasks, and when the API freezes, much of the portal becomes unusable, as most of its features rely on the APIs.

Additionally, when attempting to parse a larger batch of files, the process takes so long that it fails the health check, causing the orchestrator (e.g. AKS, ECS, Kubernetes) to restart the container.

Is this behavior expected, or is there something I might have missed?

Thanks, Clive

chiragjn commented 3 weeks ago

Hey Clive, thanks for reporting this. This is strange: the backend server should keep working even while something is processing. It is possible that recent changes have started blocking the event loop. We'll take a look and plan changes for this.

There are a bunch of call sites that are sync, and it would be ideal to move the parsing to Starlette's background tasks.
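To illustrate the suspected cause, here is a minimal sketch using only the Python standard library. It is not Cognita's code and not the change made in #321. `parse_files`, `heartbeat`, and `max_gap` are hypothetical names: `parse_files` stands in for slow synchronous parsing, and `heartbeat` stands in for other requests (such as a health check) served by the same event loop. The sketch uses `asyncio.to_thread` as a stand-in for any mechanism that moves blocking work off the loop; Starlette's background tasks, mentioned above, are another option.

```python
import asyncio
import contextlib
import time


def parse_files(seconds: float) -> str:
    """Hypothetical stand-in for slow, synchronous file parsing."""
    time.sleep(seconds)  # blocks whichever thread calls it
    return "parsed"


async def heartbeat(ticks: list[float], interval: float = 0.01) -> None:
    """Stand-in for other requests handled by the same event loop."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def max_gap(offload: bool, work_seconds: float = 0.3) -> float:
    """Return the longest pause between heartbeats while parsing runs."""
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.05)  # let the heartbeat record a few ticks

    if offload:
        # The blocking call runs in a worker thread; the loop stays free.
        await asyncio.to_thread(parse_files, work_seconds)
    else:
        # A sync call inside a coroutine: nothing else can run until it returns.
        parse_files(work_seconds)

    await asyncio.sleep(0.05)  # let the heartbeat resume before stopping it
    beat.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await beat
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print("longest stall, sync call on the loop:", asyncio.run(max_gap(offload=False)))
    print("longest stall, offloaded to a thread:", asyncio.run(max_gap(offload=True)))
```

When the synchronous call runs directly inside the coroutine, the heartbeat stalls for roughly the full parsing time, which matches the frozen API and failed health checks described in the report. When the call is offloaded, the heartbeat keeps ticking at its normal interval. Note that `asyncio.to_thread` requires Python 3.9 or later.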

chiragjn commented 3 days ago

Addressed by #321

clive97 commented 3 days ago

Thank you for the follow-up. It is fixed now.

-Clive