Issue with graphrag.index when cache and output directories are removed

yakeworld commented 4 months ago

The command python -m graphrag.index --root ./ragtest runs successfully when the cache and output directories already exist.

However, after deleting the cache and output directories and running the command again:

python -m graphrag.index --root ./ragtest

The following error occurs:

create_base_extracted_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
                                        entity_graph
0  <graphml xmlns="http://graphml.graphdrawing.or...
❌ create_final_entities
None
⠙ GraphRAG Indexer 
├── Loading Input (InputFileType.text) - 1 files loaded (1 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
└── create_base_entity_graph
❌ Errors occurred during the pipeline run, see logs for more details.
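
For anyone trying to reproduce this, the steps above can be sketched as a small helper. It assumes the cache and output directories live directly under the --root folder (./ragtest here). The function names are made up for illustration and are not part of graphrag.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def reset_index_state(root, subdirs=("cache", "output")):
    """Delete the given subdirectories of root; return the names actually removed.

    Assumption: "cache" and "output" sit directly under the --root folder.
    """
    removed = []
    for name in subdirs:
        target = Path(root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


def run_indexer(root):
    """Run the indexing command from the report and return its exit code."""
    cmd = [sys.executable, "-m", "graphrag.index", "--root", str(root)]
    return subprocess.run(cmd).returncode
```

Calling reset_index_state("./ragtest") and then run_indexer("./ragtest") should recreate the failing run, provided graphrag is installed in the same Python environment.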

The failure is primarily due to a bug in the Ollama embedding service, so the workaround for now is to switch to a different embedding service.

yakeworld commented 4 months ago

python -m graphrag.query --root ./ragtest --method local "What is microscopy ?"

INFO: Reading settings from ragtest/settings.yaml
creating llm client with model: llama3
creating embedding llm client with model: nomic-embed-text:latest
ERROR:root:Error getting OpenAI compatible embedding: 404 page not found
ERROR:root:Failed to generate embedding
ERROR:root:Failed to generate embedding for query
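
A 404 here suggests the embedding route being called does not exist on the server. One way to check that outside of graphrag is to send a single tiny request to the endpoint. The sketch below is only a diagnostic aid under stated assumptions: the base URLs (http://localhost:11434 and its /v1 variant) and the two route shapes are assumptions about a local Ollama setup, and the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_embedding_request(base_url, model, text, style="openai"):
    """Return (url, payload) for one embedding call.

    style="openai": OpenAI-style route, {base_url}/embeddings with an "input" field.
    style="ollama": assumed Ollama-native route, {base_url}/api/embeddings with "prompt".
    """
    base = base_url.rstrip("/")
    if style == "openai":
        return base + "/embeddings", {"model": model, "input": text}
    if style == "ollama":
        return base + "/api/embeddings", {"model": model, "prompt": text}
    raise ValueError(f"unknown style: {style}")


def probe_embedding_endpoint(base_url, model, style="openai", timeout=10):
    """Send one small request and return a short status string instead of raising."""
    url, payload = build_embedding_request(base_url, model, "ping", style)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"{url}: HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # A 404 here would match the "404 page not found" in the log above.
        return f"{url}: HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        return f"{url}: unreachable ({err})"
```

For example, comparing probe_embedding_endpoint("http://localhost:11434/v1", "nomic-embed-text:latest") with probe_embedding_endpoint("http://localhost:11434", "nomic-embed-text:latest", style="ollama") would show whether only the OpenAI-style route is missing.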

severian42 commented 4 months ago

Thanks for sharing this! I'm trying to find a solid workaround, but it seems I'll have to move away from Ollama until its embedding support becomes more compatible and robust. I'm moving to a more API-centric design for the app, so hopefully we can solve the issue that way. I'll keep digging into this to see what I can do.

severian42 / GraphRAG-Local-UI

Issue with graphrag.index when cache and output directories are removed #54