danny-avila / rag_api

ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector
https://librechat.ai/

Embedding Operations Very Slow for 2MB CSV File #54

Open dkindlund opened 2 days ago

dkindlund commented 2 days ago

Hi @danny-avila, after configuring rag_api in a Docker container, it looks like whenever a file is submitted from LibreChat, rag_api splits the file into chunks and sends them through the embedding API sequentially, one chunk at a time. For a 2MB CSV file, that process is very slow.

Have you considered processing the chunks in batches instead? For example, could we specify a limit such as "process 10 chunks at a time"?
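To make the batching idea concrete, here is a minimal sketch. It is not rag_api's actual code: `embed_batch` is a hypothetical async callable standing in for whatever embedding client is configured, and `batch_size` and `max_concurrency` are illustrative parameter names, not existing settings.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

# Hypothetical type: an async function that embeds a list of texts in one
# request and returns one vector per text, in the same order.
EmbedBatch = Callable[[List[str]], Awaitable[List[List[float]]]]


def make_batches(chunks: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split chunks into consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(chunks[i:i + batch_size]) for i in range(0, len(chunks), batch_size)]


async def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: EmbedBatch,
    batch_size: int = 10,       # assumed default, e.g. "10 chunks at a time"
    max_concurrency: int = 2,   # assumed cap on simultaneous requests
) -> List[List[float]]:
    """Embed all chunks batch by batch, with a bounded number of requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch: List[str]) -> List[List[float]]:
        # The semaphore acts as a crude rate limit on concurrent API calls.
        async with semaphore:
            return await embed_batch(batch)

    # gather() returns results in the order the batches were submitted,
    # so the flattened vectors line up with the original chunk order.
    per_batch = await asyncio.gather(*(run(b) for b in make_batches(chunks, batch_size)))
    return [vector for vectors in per_batch for vector in vectors]
```

With 25 chunks and `batch_size=10`, this would issue three requests (10, 10, and 5 chunks) instead of 25, with at most two running at once. Whether the configured embedding provider accepts multiple inputs per request is an assumption that would need checking.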

Let me know your thoughts here.

As it stands, rag_api can only handle small files, because these delays cause the LibreChat file upload to time out on anything larger.

dkindlund commented 2 days ago

Oh, as a temporary measure, @danny-avila: maybe introduce a max file size limit setting. That way, rag_api can proactively reject files submitted via LibreChat that are larger than X. (This would be a precaution until larger files are supported.)
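A rough sketch of such a size check follows. The environment variable name `MAX_FILE_SIZE_BYTES`, the 1 MiB default, and the `FileTooLargeError` exception are all hypothetical and chosen only for illustration; none of them exist in rag_api today as far as this thread shows.

```python
import os
from typing import Mapping, Optional

# Hypothetical default limit (1 MiB); the real value would be a project decision.
DEFAULT_MAX_FILE_SIZE_BYTES = 1 * 1024 * 1024


class FileTooLargeError(Exception):
    """Raised when an uploaded file exceeds the configured size limit."""


def get_max_file_size(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the limit from a (hypothetical) MAX_FILE_SIZE_BYTES setting."""
    env = os.environ if env is None else env
    return int(env.get("MAX_FILE_SIZE_BYTES", DEFAULT_MAX_FILE_SIZE_BYTES))


def ensure_within_limit(size_bytes: int, max_bytes: int) -> None:
    """Reject the file up front, before any chunking or embedding work starts."""
    if size_bytes > max_bytes:
        raise FileTooLargeError(
            f"File is {size_bytes} bytes; the limit is {max_bytes} bytes."
        )
```

In a FastAPI route, the handler could catch `FileTooLargeError` and return an HTTP 413 (Payload Too Large) response, so LibreChat gets an immediate rejection rather than a timeout.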