microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

About language #371

Open mc112611 opened 2 weeks ago

mc112611 commented 2 weeks ago

What languages does GraphRAG support? Can I use Japanese documents for RAG?

eyast commented 2 weeks ago

I believe so. GraphRAG makes calls to an OpenAI or Azure OpenAI LLM, and their tokenizers support Japanese. Keep in mind that the instructions (and examples) provided to the LLM for entity extraction and summarization are all in English. You might want to modify these prompts.

For example, when you initialize the indexer, you will see prompts in prompts\entity_extraction.txt. Line 20 says "Return output in English". You might want to modify this so that the result is returned in Japanese.
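If you would rather script that change than edit the file by hand, something like the following might work. It is only a sketch: the helper names `set_output_language` and `patch_prompt_file` are made up for illustration and are not part of GraphRAG, and it assumes the prompt contains the exact phrase quoted above, which may differ in your version.

```python
from pathlib import Path

# Assumption: this is the phrase quoted in this thread; the exact wording
# (and line number) in your copy of the prompt may be different.
OLD_INSTRUCTION = "Return output in English"


def set_output_language(prompt_text: str, language: str) -> str:
    """Return prompt_text with the output-language instruction rewritten."""
    if OLD_INSTRUCTION not in prompt_text:
        # Fail loudly instead of silently leaving the prompt unchanged.
        raise ValueError(f"{OLD_INSTRUCTION!r} not found; check the prompt wording")
    return prompt_text.replace(OLD_INSTRUCTION, f"Return output in {language}")


def patch_prompt_file(path: Path, language: str) -> None:
    """Rewrite a prompt file in place (hypothetical helper, not a GraphRAG API)."""
    text = path.read_text(encoding="utf-8")
    path.write_text(set_output_language(text, language), encoding="utf-8")
```

Calling `patch_prompt_file(Path("prompts") / "entity_extraction.txt", "Japanese")` from the project root would then rewrite the instruction. The English examples in the prompt are left untouched, so you may still want to translate those by hand.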

mc112611 commented 2 weeks ago

Thank you so much. I hope this project will support local LLMs, not only ChatGPT.

jgbradley1 commented 2 weeks ago

We welcome contributions from the community 😉