infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

Chat assistant response slow #2065

IamHarri opened this issue 2 months ago

IamHarri commented 2 months ago

Describe your problem

I deployed RAGFlow (infiniflow/ragflow:v0.9.0) on AWS EKS, with two nodes running all the dependencies (Redis, MySQL, MinIO, Elasticsearch). Node details: 64 GB RAM, 8 CPUs, 0 GPUs, 100 GB disk, Intel Xeon Platinum 8175 processor.

Ollama is hosted locally with llama3:latest as the chat model and mxbai-embed-large as the embedding model, on a separate node: 64 GB RAM, 8 CPUs, 0 GPUs, 100 GB disk, Intel Xeon 8375C (Ice Lake) @ 3.5 GHz.

Document parsing works well; it takes around 5 minutes for a large document. But the chat assistant is very slow: I only say "hi" and it takes about a minute to search and respond. Do you have any idea why it is so slow, even though my compute resources are sizable?
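
For reference, here is roughly how the chat model alone can be timed, to see how much of the latency comes from CPU-only LLM inference rather than retrieval. This is a minimal sketch, assuming Ollama is listening on its default port 11434 and the model name is exactly llama3:latest:

```python
import time
import requests

# Time a single-turn generation against the local Ollama server.
# Assumes the default Ollama port (11434) and the llama3:latest model.
OLLAMA_URL = "http://localhost:11434/api/generate"

start = time.time()
resp = requests.post(
    OLLAMA_URL,
    json={"model": "llama3:latest", "prompt": "hi", "stream": False},
    timeout=300,
)
resp.raise_for_status()
elapsed = time.time() - start

body = resp.json()
# Ollama reports durations in nanoseconds; fall back to wall-clock time if absent.
total_s = body.get("total_duration", 0) / 1e9 or elapsed
print(f"wall clock: {elapsed:.1f}s, ollama total_duration: {total_s:.1f}s")
print(f"eval_count: {body.get('eval_count')} tokens")
```

If this call alone takes tens of seconds, the CPU-only LLM, not the search, is likely the main contributor.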

KevinHuSh commented 2 months ago

It might be caused by the search in Elasticsearch, whose performance depends heavily on available RAM and on the number of docs it has indexed. A single PDF can generate thousands of docs in ES.
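
One way to check the ES side is to look at doc counts and raw search latency directly. A minimal sketch, assuming Elasticsearch is reachable on its default port 9200 without authentication (adjust host and credentials to your EKS setup):

```python
import requests

# Assumes Elasticsearch is reachable at this address without auth; adjust as needed.
ES_URL = "http://localhost:9200"

# List indices with their doc counts and on-disk size.
print(requests.get(f"{ES_URL}/_cat/indices?v&h=index,docs.count,store.size").text)

# Run a trivial match_all search; "took" reports ES's own query time in milliseconds.
resp = requests.post(f"{ES_URL}/_search", json={"query": {"match_all": {}}, "size": 1})
print("search took:", resp.json().get("took"), "ms")
```

If "took" is only a few milliseconds, ES is probably not where the minute is going.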

IamHarri commented 2 months ago

I have checked ES resource consumption; it uses only 8 GB of RAM and still has plenty of available resources. Do you have any idea how to optimize the assistant's performance in my case? One simple question takes 2 minutes, and complex questions do not get a response at all.
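
Since the question also has to be embedded by mxbai-embed-large before the search, it may be worth timing that step as well. A minimal sketch, assuming Ollama's default port and its /api/embeddings endpoint:

```python
import time
import requests

# Assumes Ollama's default port; the model name must match what RAGFlow is configured with.
OLLAMA_URL = "http://localhost:11434/api/embeddings"

start = time.time()
resp = requests.post(
    OLLAMA_URL,
    json={"model": "mxbai-embed-large", "prompt": "hi"},
    timeout=120,
)
resp.raise_for_status()
print(f"embedding call took {time.time() - start:.2f}s, "
      f"vector length: {len(resp.json().get('embedding', []))}")
```

Comparing these three timings (embedding, ES search, LLM generation) against the ~2 minute end-to-end latency should show where the time is actually being spent.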