[Feature Request]: Improve the indexing time (create_community_report part)

microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

https://microsoft.github.io/graphrag/

MIT License

18.38k stars 1.79k forks source link

[Feature Request]: Improve the indexing time (create_community_report part) #746

Open anonymousz97 opened 2 months ago

anonymousz97 commented 2 months ago

Do you need to file an issue?

[X] I have searched the existing issues and this feature is not already filed.
[ ] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
[X] I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

I'm trying to index around 100 docs contains from wiki. The create_community_report cost me around 10 hours. Is that normal? And when i check the GPU since i use the OSS model i realize that the GPU is not fully used or the code is not optimize to call full throughput of the server. How can i improve that? Thanks

Describe the solution you'd like

No response

Additional context

No response

natoverse commented 2 months ago

Community report creation can be an expensive process because it recursively generates summaries for every community in the Leiden hierarchy, and rolls them up as it approaches the root. If your model/API capacity is not being fully utilized you can tune the TPM/RPM/threading params to maximize usage. However, if you aren't using an OpenAI API-compliant model (which is sounds like if you are observing your GPU usage), we can't guarantee the supported parameters are all the same, so you may need to do some hunting to find the right way to tune your throughput. Have a look at #657 to see if others have a solution - there is quite a bit of discussion around optimizing throughput.

jwen6118 commented 2 months ago

It's very slow. Is there an optimization method？

anonymousz97 commented 2 months ago

It's very slow. Is there an optimization method？

Not yet, i think at the moment we can't decrease the time from this method since i rechecked the paper and the code, lots of LLM calls due to the naive algorithm (almost based on LLM). I think you can use small LLM to generate community report and let the LLM verify or you can believe in smaller model so it will reduce the time (or you have to pay for API so it will reduce the time. Another way is to find others host and create a script to convert back to OpenAI template as i do in my private server.