[Issue]: Can multiple model instances be called concurrently to construct a graph?

microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

https://microsoft.github.io/graphrag/

MIT License

20.17k stars 1.97k forks

[Issue]: Can multiple model instances be called concurrently to construct a graph? #949

Closed. c0derm4n closed this issue 3 months ago

c0derm4n commented 3 months ago

Is there an existing issue for this?

[ ] I have searched the existing issues
[ ] I have checked #657 to validate if my issue is covered by community support

Describe the issue

Graph construction is currently too slow, especially when using larger local models.

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

GraphRAG Version:
Operating System:
Python Version:
Related Issues:

natoverse commented 3 months ago

GraphRAG has a number of settings for tuning parallelization and token consumption, but they assume you are working with a single API endpoint. You can add an llm config block to any step to override the default configuration, which may help. More generally, though, I think you're looking for a load-balancing system that spreads requests across multiple endpoints within a single verb. We don't plan to support this specifically because it is more of an infrastructure concern. I would suggest using a proxy or gateway that can do this without GraphRAG needing to be aware of it.
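
For readers wondering what such a gateway does internally, here is a minimal sketch of round-robin dispatch across several model servers. It is not part of GraphRAG. The class `RoundRobinBalancer`, the functions `dispatch_all`, `fake_send`, and `demo`, and the two localhost URLs are all hypothetical names chosen for illustration. `fake_send` stands in for a real HTTP call, so the sketch runs without any model server.

```python
import asyncio
import itertools
from typing import Awaitable, Callable, Iterable, List


class RoundRobinBalancer:
    """Cycle through a fixed list of endpoint URLs (hypothetical helper)."""

    def __init__(self, endpoints: Iterable[str]) -> None:
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(self._endpoints)

    def next_endpoint(self) -> str:
        # Each call returns the next URL, wrapping around at the end.
        return next(self._cycle)


async def dispatch_all(
    prompts: List[str],
    balancer: RoundRobinBalancer,
    send: Callable[[str, str], Awaitable[str]],
    max_in_flight: int = 4,
) -> List[str]:
    """Send every prompt, rotating endpoints and capping concurrency."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def worker(prompt: str) -> str:
        # The endpoint is picked when the task first runs, in task order.
        endpoint = balancer.next_endpoint()
        async with semaphore:
            return await send(endpoint, prompt)

    # gather preserves the order of the input prompts in its result list.
    return await asyncio.gather(*(worker(p) for p in prompts))


async def fake_send(endpoint: str, prompt: str) -> str:
    """Assumption: placeholder for a real request to a model server."""
    await asyncio.sleep(0)
    return f"{endpoint} handled {prompt!r}"


def demo() -> List[str]:
    # Assumed URLs: two local model servers on different ports.
    balancer = RoundRobinBalancer(
        ["http://localhost:8001/v1", "http://localhost:8002/v1"]
    )
    return asyncio.run(dispatch_all(["a", "b", "c"], balancer, fake_send))


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In practice this logic would live in a separate proxy process that exposes one URL, and GraphRAG would be pointed at that single URL, so the indexing pipeline itself stays unchanged. A production gateway would also need health checks, retries, and per-endpoint rate limits, which this sketch leaves out.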