Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License
6.06k stars 4.15k forks

GraphRAG support #1928

Open DuboisABB opened 2 months ago

DuboisABB commented 2 months ago

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

I just recently learned about a new RAG approach called GraphRAG, which shows impressive improvements over plain RAG: https://data.world/blog/generative-ai-benchmark-increasing-the-accuracy-of-llms-in-the-enterprise-with-a-knowledge-graph/

Microsoft already has an accelerator project here: https://github.com/Azure-Samples/graphrag-accelerator

It would be nice if this new RAG approach could be integrated into this project, although I'm not sure if it's mature enough yet.

pamelafox commented 2 months ago

Yes, we've seen that and even tinkered with it a bit, back before it was public. It'd help to know if developers here have tried it and seen quality improvements. If anyone else has, please comment and share!

pamelafox commented 2 months ago

Related: There's a session on GraphRAG at our upcoming RAGHack: https://reactor.microsoft.com/reactor/events/23432/

DuboisABB commented 2 months ago

Thanks, registered.

Lawndemon commented 2 months ago

I've got the GraphRAG project working but have not merged it into this project. Out of the gate, the responses seem much deeper and more concise. That said, turning on reranking with the vector similarity model is still very strong. GraphRAG also forces us to use UTF-8 encoding, which is a big change if you're using the computer vision version of this project.

DuboisABB commented 2 months ago

Interesting, thanks for your comment. From what I see, the cost of building the graph is the real showstopper. In the first release of our app, we have about 10,000 documents in the index (~394k chunks), and we plan on adding even more. GPT-4o mini seems to be the cheapest option, but from the estimates I saw, it would still cost thousands to build the graph. What was the cost for you?
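
For anyone wanting to run a similar back-of-the-envelope estimate, here is a minimal sketch. Only the ~394k chunk count comes from this thread. The token counts, calls per chunk, and prices are placeholder assumptions, and `IndexingAssumptions` and `estimate_graph_build_cost` are hypothetical names, not part of GraphRAG or this project. A real graph build may also include steps not modeled here (for example summarization), so treat the result as a rough lower bound at best.

```python
from dataclasses import dataclass


@dataclass
class IndexingAssumptions:
    """Hypothetical inputs; none of these values come from the thread."""
    input_tokens_per_call: int        # prompt plus chunk text sent per LLM call
    output_tokens_per_call: int       # tokens returned per LLM call
    calls_per_chunk: float            # LLM calls made for each chunk
    input_price_per_million: float    # currency units per 1M input tokens
    output_price_per_million: float   # currency units per 1M output tokens


def estimate_graph_build_cost(num_chunks: int, a: IndexingAssumptions) -> float:
    """Return the estimated LLM cost of one full pass over all chunks."""
    total_calls = num_chunks * a.calls_per_chunk
    input_cost = (
        total_calls * a.input_tokens_per_call / 1_000_000 * a.input_price_per_million
    )
    output_cost = (
        total_calls * a.output_tokens_per_call / 1_000_000 * a.output_price_per_million
    )
    return input_cost + output_cost


# Placeholder numbers for illustration only: substitute measured token counts
# and your provider's current prices before drawing any conclusion.
example = IndexingAssumptions(
    input_tokens_per_call=2_000,
    output_tokens_per_call=500,
    calls_per_chunk=2,
    input_price_per_million=1.0,
    output_price_per_million=4.0,
)
cost = estimate_graph_build_cost(394_000, example)
print(f"Estimated cost for one full build: {cost:,.2f}")
```

The estimate scales linearly with the chunk count and with the number of LLM calls per chunk, so those two inputs are the ones worth measuring on a small sample first.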

nickmachairas commented 4 days ago

Indeed, the cost to build the knowledge graph is high. Even more so given that, at the moment, it appears the knowledge graph needs to be rebuilt from scratch when new documents are added.