karthik-codex / Autogen_GraphRAG_Ollama

Microsoft's GraphRAG + AutoGen + Ollama + Chainlit = Fully Local & Free Multi-Agent RAG Superbot

error during the execution of the graphrag after following the steps #7

Open VidhyaVarshanyJS opened 3 months ago

VidhyaVarshanyJS commented 3 months ago

```
❌ create_summarized_entities
None
⠼ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━ 100% 0:0… 0:0…
├── create_base_text_units
├── create_base_extracted_entities
└── create_summarized_entities
❌ Errors occurred during the pipeline run, see logs for more details.
```

karthik-codex commented 3 months ago

have you looked at your log file? what models are you using in your settings.yaml?

VidhyaVarshanyJS commented 2 months ago

Mistral and nomic embed text only


VidhyaVarshanyJS commented 2 months ago

Hi there,

I am running the model on a low-spec machine. I am doing a project with GraphRAG, for which I need to index the book below, which is related to medicine. Can you generate the GraphRAG indexer files for me and send me the zip file via Gmail?

https://drive.google.com/file/d/0B7HZIUBvCH1EZ1REYVFnYjZscTQ/view?resourcekey=0-eZTYSn0ue4iU8l20-TdOrQ

karthik-codex commented 2 months ago

Hi Vidhya, sorry, I do not have the resources to perform the indexing for 900 pages. I would recommend converting the PDF to Markdown first using the script provided in the /Utils folder, then splitting the Markdown into ~20-30 chunks and indexing them sequentially.
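The splitting step above can be sketched in Python. This is an illustrative helper, not a script from the repo; the file names in the commented usage are assumptions:

```python
from pathlib import Path


def split_markdown(text: str, num_chunks: int = 25) -> list[str]:
    """Split text on blank-line paragraph boundaries into roughly num_chunks pieces,
    so no chunk cuts a paragraph in half."""
    paragraphs = text.split("\n\n")
    target = max(1, len(text) // num_chunks)  # approximate characters per chunk
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in paragraphs:
        current.append(para)
        size += len(para)
        if size >= target:
            chunks.append("\n\n".join(current))
            current, size = [], 0
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Hypothetical usage: split book.md into chunk_00.md, chunk_01.md, ...
# text = Path("book.md").read_text(encoding="utf-8")
# for i, chunk in enumerate(split_markdown(text)):
#     Path(f"chunk_{i:02d}.md").write_text(chunk, encoding="utf-8")
```

Each chunk can then be dropped into the GraphRAG input folder and indexed in turn.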

VidhyaVarshanyJS commented 2 months ago

I am new to this. Is it possible to run the Ollama model locally?


karthik-codex commented 2 months ago

I am new to this .. Is it possible to run the ollama local model?

Yes. Download and install the tool from Ollama.com. Then open cmd prompt and execute "ollama run llama3" to run Llama3 locally.

VidhyaVarshanyJS commented 2 months ago

The embedding API call is not working whenever I start to index documents using GraphRAG. It shows an error while creating the entity relationships, and when I look into the log file it says "Error invoking the LLM". I followed the steps as described, replacing the embedding.py and openai_llm_embedding.py files in GraphRAG from the utils folder, but it still shows the error message. Can you help?


0xf179ed0e commented 2 months ago

I had the same error. I had to change my settings.yaml file.

The following lines were wrong:

```yaml
model: nomic_embed_text #text-embedding-3-large  #mxbai-embed-large #
api_base: http://localhost:11434/api
```

I had to change them to:

```yaml
model: nomic-embed-text #text-embedding-3-large  #mxbai-embed-large #
api_base: http://localhost:11434/v1
```

karthik-codex commented 2 months ago

I may have had a similar issue. I think I had to revert to nomic_embed_text and 11434/api during local search inference. Let me know which one works.
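For context, both of those lines live in the embeddings block of settings.yaml. A sketch of that block, following GraphRAG's default template (the surrounding key names are assumptions from that template, not taken from this thread):

```yaml
embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic-embed-text              # Ollama model tags use hyphens
    api_base: http://localhost:11434/v1  # Ollama's OpenAI-compatible endpoint
```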

VidhyaVarshanyJS commented 2 months ago

Ok, I'll check.


VidhyaVarshanyJS commented 2 months ago

How can I evaluate GraphRAG's performance on out-of-scope questions? For example, suppose I am handling medical data and the corpus states that protein A causes disease X and protein B causes disease Y. If the user asks which disease proteins A and B cause together, the LLM might answer with some disease C that is not in the corpus. How can I trust that the answer GraphRAG provides is correct, and how can I evaluate it?
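GraphRAG itself does not ship a trust score for this; one common evaluation idea is a groundedness check, i.e. flagging answers that mention entities absent from the retrieved context. A minimal illustrative sketch (the capitalized-phrase regex is a crude stand-in for real entity extraction):

```python
import re


def ungrounded_entities(answer: str, context: str) -> set[str]:
    """Return capitalized multi-word entities mentioned in the answer
    that never appear in the retrieved context (case-insensitive)."""
    entities = set(re.findall(r"[A-Z][a-z]+(?:\s[A-Z][A-Za-z0-9]*)+", answer))
    return {e for e in entities if e.lower() not in context.lower()}


context = "Protein A causes Disease X. Protein B causes Disease Y."
answer = "Protein A and Protein B together cause Disease C."
print(ungrounded_entities(answer, context))  # → {'Disease C'}
```

Any flagged entity is a candidate hallucination that a human (or a stronger judge model) should verify; empty output does not prove correctness, only that the answer stays within the corpus vocabulary.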

hongbo-sun commented 2 months ago

```
❌ create_summarized_entities
None
⠼ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━ 100% 0:0… 0:0…
├── create_base_text_units
├── create_base_extracted_entities
└── create_summarized_entities
❌ Errors occurred during the pipeline run, see logs for more details.
```

It may be caused by the LLM not responding. I changed the max retries as follows and it works.

```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3
  model_supports_json: true # recommended if this is available for your model.
  api_base: http://localhost:11434/v1
  max_retries: 1000
```