microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

[Bug]: Error executing verb "cluster_graph" in create_base_entity_graph: EmptyNetworkError details=None #562

Closed · niubiqianshui closed this 1 month ago

niubiqianshui commented 1 month ago

Describe the bug

    06:49:32,683 graphrag.index.reporting.file_workflow_callbacks INFO Error executing verb "cluster_graph" in create_base_entity_graph: EmptyNetworkError details=None
    06:49:32,684 graphrag.index.run ERROR error running workflow create_base_entity_graph
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/run.py", line 323, in run_pipeline
        result = await workflow.run(context, callbacks)
      File "/usr/local/lib/python3.10/site-packages/datashaper/workflow/workflow.py", line 369, in run
        timing = await self._execute_verb(node, context, callbacks)
      File "/usr/local/lib/python3.10/site-packages/datashaper/workflow/workflow.py", line 410, in _execute_verb
        result = node.verb.func(**verb_args)
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py", line 61, in cluster_graph
        results = output_df[column].apply(lambda graph: run_layout(strategy, graph))
      File "/usr/local/lib/python3.10/site-packages/pandas/core/series.py", line 4924, in apply
        ).apply()
      File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1427, in apply
        return self.apply_standard()
      File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1507, in apply_standard
        mapped = obj._map_values(
      File "/usr/local/lib/python3.10/site-packages/pandas/core/base.py", line 921, in _map_values
        return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
      File "/usr/local/lib/python3.10/site-packages/pandas/core/algorithms.py", line 1743, in map_array
        return lib.map_infer(values, mapper, convert=convert)
      File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py", line 61, in <lambda>
        results = output_df[column].apply(lambda graph: run_layout(strategy, graph))
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py", line 167, in run_layout
        clusters = run_leiden(graph, strategy)
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/verbs/graph/clustering/strategies/leiden.py", line 26, in run
        node_id_to_community_map = _compute_leiden_communities(
      File "/usr/local/lib/python3.10/site-packages/graphrag/index/verbs/graph/clustering/strategies/leiden.py", line 61, in _compute_leiden_communities
        community_mapping = hierarchical_leiden(
      File "<@beartype(graspologic.partition.leiden.hierarchical_leiden) at 0x7fb32d90b6d0>", line 304, in hierarchical_leiden
      File "/usr/local/lib/python3.10/site-packages/graspologic/partition/leiden.py", line 588, in hierarchical_leiden
        hierarchical_clusters_native = gn.hierarchical_leiden(
    leiden.EmptyNetworkError: EmptyNetworkError
    06:49:32,685 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None
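
For context: the EmptyNetworkError at the bottom of the trace is raised by graspologic's hierarchical_leiden, which the cluster_graph verb calls on the entity graph. It means the graph handed to the Leiden step has no nodes or edges, i.e. entity extraction upstream produced nothing usable. A minimal sketch that reproduces the same failure (assuming graspologic and networkx are installed, as they are in a graphrag environment):

```python
import networkx as nx
from graspologic.partition import hierarchical_leiden

# cluster_graph builds a networkx graph from the extracted entities and
# relationships; if extraction returned nothing, that graph is empty and
# the Leiden clustering step has no network to partition.
empty_graph = nx.Graph()

try:
    hierarchical_leiden(empty_graph, max_cluster_size=10)
except Exception as exc:
    # Expected to surface as EmptyNetworkError, matching the log above.
    print(f"{type(exc).__name__}: {exc}")
```

So the clustering step itself is usually not the culprit; the real question is why entity extraction returned an empty result.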

Steps to reproduce

I have configured local models using Ollama and LM Studio: gemma2:latest for chat and nomic-embed-text-v1.5.Q5_K_M for embeddings.

settings.yaml:

    llm:
      api_key: ollama
      type: openai_chat
      model: gemma2:latest
      model_supports_json: true
      api_base: http://192.168.1.107:11434/v1

    parallelization:
      stagger: 0.3

    async_mode: threaded

    embeddings:
      async_mode: threaded
      llm:
        api_key: lm-studio
        type: openai_embedding
        model: TheBloke/nomic-emb-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf
        api_base: http://192.168.1.24:1234/v1
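
One thing worth ruling out before re-running the indexer (a sanity check, not something from the thread): confirm that both OpenAI-compatible endpoints answer with exactly these model names, since a chat model that never returns parseable output leaves the entity graph empty. A sketch using the openai Python client, with the base URLs and model names copied from the settings.yaml above:

```python
from openai import OpenAI

# Chat endpoint served by Ollama (values taken from the settings.yaml above).
chat = OpenAI(base_url="http://192.168.1.107:11434/v1", api_key="ollama")
resp = chat.chat.completions.create(
    model="gemma2:latest",
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
)
print(resp.choices[0].message.content)

# Embedding endpoint served by LM Studio.
emb = OpenAI(base_url="http://192.168.1.24:1234/v1", api_key="lm-studio")
vec = emb.embeddings.create(
    model="TheBloke/nomic-emb-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf",
    input=["hello world"],
)
print(len(vec.data[0].embedding))
```

If either call fails, entity extraction cannot produce anything, which is one way to end up with the EmptyNetworkError later in the pipeline.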


Expected Behavior

No response

GraphRAG Config Used

No response

Logs and screenshots

No response

Additional Information

caifanfan commented 1 month ago

wanglufei1 commented 1 month ago

I'm having the same problem. Has this been solved, or is there a way to avoid it?

niubiqianshui commented 1 month ago

> I'm having the same problem. Has this been solved, or is there a way to avoid it?

No.

BeginningOne commented 1 month ago

Same problem: executing verb "cluster_graph" in create_base_entity_graph fails with "Columns must be same length as key".


smalldeer1982 commented 1 month ago

same

Lincolnwill commented 1 month ago

Maybe it's an LLM model problem: qwen2:1.5b shows the same issue, but when using mistral:latest it works.

boxter007 commented 1 month ago

same

guanghuizhao commented 1 month ago

same

Ikaros-521 commented 1 month ago

same +1

Ikaros-521 commented 1 month ago

> Maybe it's an LLM model problem: qwen2:1.5b shows the same issue, but when using mistral:latest it works.

I replaced gemma2 and ran it a few more times, but I still have the same problem.

mistral does work, though.

natoverse commented 1 month ago

Consolidating alternate model issues here: #657

linh31332 commented 1 month ago

I got the same error after using prompt tuning, and found that the new prompt had some issues (e.g. missing {record_delimiter}, wrong format...). Modifying it manually to match the init prompt solved the problem.

niubiqianshui commented 1 month ago

> I got the same error after using prompt tuning, and found that the new prompt had some issues (e.g. missing {record_delimiter}, wrong format...). Modifying it manually to match the init prompt solved the problem.

I don't quite understand; can you explain it in detail?

linh31332 commented 1 month ago

> I got the same error after using prompt tuning, and found that the new prompt had some issues (e.g. missing {record_delimiter}, wrong format...). Modifying it manually to match the init prompt solved the problem.
>
> I don't quite understand; can you explain it in detail?

I mean that if you use graphrag.prompt_tune to generate a new prompt, or write one manually, the new entity extraction prompt may have some issues, such as a missing tuple delimiter or record delimiter, or wrongly formatted output in the examples. This can cause the entity extraction results to come back in the wrong format. So check the entity extraction prompt, especially the output format in the examples; I think the entity output format should be the same as in the initial prompt.
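
As a rough check along these lines (a hypothetical helper, not part of graphrag): the default entity_extraction template relies on a handful of placeholders, and a tuned prompt that drops any of them tends to produce output the extractor cannot parse, which then surfaces downstream as an empty graph. Something like this can flag a broken prompt file:

```python
from pathlib import Path

# Placeholders the default entity_extraction prompt uses (an assumption based
# on recent graphrag versions); a tuned prompt missing any of these is suspect.
REQUIRED = [
    "{tuple_delimiter}",
    "{record_delimiter}",
    "{completion_delimiter}",
    "{entity_types}",
    "{input_text}",
]

prompt = Path("prompts/entity_extraction.txt").read_text(encoding="utf-8")
missing = [p for p in REQUIRED if p not in prompt]
print("missing placeholders:", missing if missing else "none")
```

Beyond the placeholders, the example outputs in a tuned prompt should follow the same record format as the init prompt, as noted above.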

Upcreat commented 1 month ago

I have the same problem

tigerzhou-cool commented 1 month ago

I had the same problem, but it was solved. My setup is ollama + graphrag. Simply changing the llm and embeddings settings in settings.yaml is not sufficient; I also set max_gleanings to 0, like this:

    entity_extraction:
      prompt: "prompts/entity_extraction.txt"
      entity_types: [organization, person, geo, event]
      max_gleanings: 0  # here

    claim_extraction:
      prompt: "prompts/claim_extraction.txt"
      description: "Any claims or facts that could be relevant to information discovery."
      max_gleanings: 0  # here

With that, it works.

Dr-jw commented 1 month ago

> I had the same problem, but it was solved. My setup is ollama + graphrag. Simply changing the llm and embeddings settings in settings.yaml is not sufficient; I also set max_gleanings to 0 for entity_extraction and claim_extraction, as in the config above. With that, it works.

How so? Please explain in more detail.

dar4545 commented 1 month ago

> I got the same error after using prompt tuning, and found that the new prompt had some issues (e.g. missing {record_delimiter}, wrong format...). Modifying it manually to match the init prompt solved the problem.
>
> I don't quite understand; can you explain it in detail?
>
> I mean that if you use graphrag.prompt_tune to generate a new prompt, or write one manually, the new entity extraction prompt may have some issues, such as a missing tuple delimiter or record delimiter, or wrongly formatted output in the examples. This can cause the entity extraction results to come back in the wrong format. So check the entity extraction prompt, especially the output format in the examples; I think the entity output format should be the same as in the initial prompt.

This is exactly my case. Thank you for sharing.