SLKAlgs opened this issue 3 days ago
This approach could lead to conflicts. One possible workaround is to divide the document into multiple parts, cache the LLM extraction results for each part, and then merge the caches before performing the insertion (see the sketch below). I haven't tried this yet, but it should be feasible in principle.
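A minimal sketch of that idea, assuming a plain Python pipeline. `split_document`, `extract_part`, and `merge_caches` are hypothetical helpers (not LightRAG APIs), and the per-part extraction is stubbed out where the real LLM call would go:

```python
import json
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

CACHE_DIR = Path("extraction_cache")
CACHE_DIR.mkdir(exist_ok=True)

def split_document(doc: str, n_parts: int) -> list[str]:
    """Split the document into roughly equal character chunks."""
    size = max(1, len(doc) // n_parts)
    return [doc[i:i + size] for i in range(0, len(doc), size)]

def extract_part(args: tuple[int, str]) -> str:
    """Run per-part extraction and cache the result to disk (stubbed)."""
    idx, part = args
    cache_file = CACHE_DIR / f"part_{idx}.json"
    if not cache_file.exists():  # reuse the cache on re-runs
        result = {"part": idx, "text": part}  # placeholder for the real LLM output
        cache_file.write_text(json.dumps(result))
    return str(cache_file)

def merge_caches(files: list[str]) -> list[dict]:
    """Merge the per-part caches back into document order."""
    ordered = sorted(files, key=lambda f: int(Path(f).stem.split("_")[1]))
    return [json.loads(Path(f).read_text()) for f in ordered]

if __name__ == "__main__":
    doc = Path("document.txt").read_text()
    parts = split_document(doc, n_parts=8)
    with ProcessPoolExecutor() as pool:
        cache_files = list(pool.map(extract_part, enumerate(parts)))
    merged = merge_caches(cache_files)
    # rag.insert(...)  # single insertion over the merged results
```

A side benefit of the on-disk cache is that the run becomes restartable: if one part fails, only that part needs to be re-extracted.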
To mitigate #315, I modified my code to call rag.insert() on each page of the document, i.e. [rag.insert(page) for page in doc] instead of rag.insert(doc), and I'm finding this method very slow as well.
The rag.insert() process is too slow. Can I divide the document into multiple parts, run the inserts on different GPUs within the same project, and finally build a knowledge graph of the complete document?
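For the multi-GPU part, a common pattern (not something LightRAG exposes directly, as far as I know) is to spawn one worker process per GPU and pin each worker with CUDA_VISIBLE_DEVICES before any CUDA library initializes in it. `process_shard` below is a hypothetical stand-in for the per-part extraction/insert step:

```python
import os
from multiprocessing import get_context

def process_shard(args: tuple[int, str]) -> int:
    """Process one document shard on one dedicated GPU."""
    gpu_id, shard = args
    # Pin this worker to a single GPU; must be set before CUDA initializes.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # ... load the model and run the per-part extraction/insert here ...
    return len(shard)

if __name__ == "__main__":
    shards = ["first half ...", "second half ..."]  # document split beforehand
    gpus = [0, 1]                                   # one shard per GPU
    ctx = get_context("spawn")  # "spawn" gives each worker a clean CUDA state
    with ctx.Pool(len(gpus)) as pool:
        sizes = pool.map(process_shard, list(zip(gpus, shards)))
    print(sizes)
```

The per-shard results would then have to be merged back into a single knowledge graph with one final insertion, as in the cache-merge sketch above.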