microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
15.17k stars, 1.38k forks

[Bug]: <title>❌ Errors occurred during the pipeline run, see logs for more details. #485

Closed shreyn07 closed 1 month ago

shreyn07 commented 1 month ago

### Describe the bug

11:17:11,575 graphrag.index.run ERROR error running workflow create_final_community_reports
Traceback (most recent call last):
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\index\run.py", line 323, in run_pipeline
    result = await workflow.run(context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\workflow\workflow.py", line 369, in run
    timing = await self._execute_verb(node, context, callbacks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb
    result = node.verb.func(**verb_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\engine\verbs\window.py", line 73, in window
    window = __window_function_map[window_operation](input_table[column])
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\range.py", line 417, in get_loc
    raise KeyError(key)
KeyError: 'community'
11:17:11,578 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

### Steps to reproduce

_No response_

### Expected Behavior

_No response_

### GraphRAG Config Used

_No response_

### Logs and screenshots

_No response_

### Additional Information

- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues:

cdg1921 commented 1 month ago

Same issue. Here are the logs:

{'data': 'Error running pipeline!',
 'details': 'null',
 'source': "'community'",
 'stack': 'Traceback (most recent call last):\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/graphrag/index/run.py", '
          'line 323, in run_pipeline\n'
          '    result = await workflow.run(context, callbacks)\n'
          '             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/workflow/workflow.py", '
          'line 369, in run\n'
          '    timing = await self._execute_verb(node, context, callbacks)\n'
          '             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/workflow/workflow.py", '
          'line 410, in _execute_verb\n'
          '    result = node.verb.func(**verb_args)\n'
          '             ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/engine/verbs/window.py", '
          'line 73, in window\n'
          '    window = '
          '__window_function_map[window_operation](input_table[column])\n'
          '             '
          '~~~^^^^^^^^\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/frame.py", '
          'line 4102, in __getitem__\n'
          '    indexer = self.columns.get_loc(key)\n'
          '              ^^^^^^^^^^^^^^^^^^^^^^^^^\n'
          '  File '
          '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/indexes/range.py", '
          'line 417, in get_loc\n'
          '    raise KeyError(key)\n'
          "KeyError: 'community'\n",
 'type': 'error'}

SeanFeng91 commented 1 month ago

Same issue:

[226 rows x 5 columns]
🚀 create_base_extracted_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
❌ create_base_entity_graph
None
⠧ GraphRAG Indexer
├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
└── create_base_entity_graph
❌ Errors occurred during the pipeline run, see logs for more details.

ricardowu1112 commented 1 month ago

same error

{"type": "error", "data": "Error executing verb \"window\" in create_final_community_reports: 'community'", "stack": "Traceback (most recent call last):\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/engine/verbs/window.py\", line 73, in window\n window = __window_function_map[window_operation](input_table[column])\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py\", line 4102, in __getitem__\n indexer = self.columns.get_loc(key)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/range.py\", line 417, in get_loc\n raise KeyError(key)\nKeyError: 'community'\n", "source": "'community'", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/graphrag/index/run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/engine/verbs/window.py\", line 73, in window\n window = __window_function_map[window_operation](input_table[column])\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py\", line 4102, in __getitem__\n indexer = self.columns.get_loc(key)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/range.py\", line 417, in get_loc\n raise KeyError(key)\nKeyError: 'community'\n", "source": "'community'", "details": null}

amitguptadumka commented 1 month ago

  File "/opt/anaconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/indexes/range.py", line 417, in get_loc
    raise KeyError(key)
KeyError: 'community'
00:53:41,724 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

Same error. Is there any support for this?

ayushjadia commented 1 month ago

Are you all using Azure or Colab?

sadimoodi commented 1 month ago

same error

shreyn07 commented 1 month ago

In settings.yml, comment out the model_supports_json line and it will work.

sadimoodi commented 1 month ago

In settings.yml, comment out the model_supports_json line and it will work.

Change it to what? To False? I already did that and it doesn't work.

shreyn07 commented 1 month ago

Just comment out that line.
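
A minimal sketch of what commenting out that line could look like, assuming the key is spelled model_supports_json and sits under the llm block of the settings file (check your own file). The helper name comment_out_key and the sample text are hypothetical, for illustration only:

```python
import re


def comment_out_key(yaml_text: str, key: str = "model_supports_json") -> str:
    """Prefix every active `key: ...` line with '# ', keeping its indentation.

    Lines that are already commented out are left untouched.
    """
    pattern = re.compile(rf"^([ \t]*)({re.escape(key)}[ \t]*:.*)$", re.MULTILINE)
    return pattern.sub(r"\1# \2", yaml_text)


# Hypothetical excerpt of a settings file (assumed layout, not copied from the issue).
sample = """llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model_supports_json: true
"""

print(comment_out_key(sample))
```

Editing the file by hand gives the same result: the line ends up as `# model_supports_json: true`. Several people in this thread report that this alone did not fix the error for them.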

amitguptadumka commented 4 weeks ago

It's not working, @shreyn07. Can you paste a code snippet?

ghwang1999 commented 4 weeks ago

same error, please save me. TAT

ghwang1999 commented 4 weeks ago

cmd:

return bound(*args, **kwds)
🚀 create_base_text_units
id ... n_tokens
0 39ac36c36504c6e966e37cefb41d1168 ... 100
1 0cc4a30f79ae4bdbddd35f6ca8f7fe55 ... 100
2 6b56a4b6d0f57039349bcc686254e2d2 ... 100
3 19d681644edc65f5961e91f3ea638f96 ... 100
4 deaf6db8c953b126a13115f3a2c56f58 ... 100
.. ... ... ...
215 153069ad0d3e07a803fe9326224fd298 ... 100
216 c7170c4985a9f8e6066b10826e0bb4cf ... 100
217 d8892d092a53a07d780633678e35157a ... 100
218 e4266030abf9bc17de8d70905e6a224c ... 95
219 c91fcd42640cb8c096e8d23ca110f22d ... 25

[660 rows x 5 columns]
🚀 create_base_extracted_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
❌ create_base_entity_graph
None
⠼ GraphRAG Indexer
├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
└── create_base_entity_graph
❌ Errors occurred during the pipeline run, see logs for more details.

logs:

{"type": "error", "data": "Entity Extraction Error", "stack": "Traceback (most recent call last):\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/index/graph/extractors/graph/graph_extractor.py\", line 118, in __call__\n result = await self._process_document(text, prompt_variables)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/index/graph/extractors/graph/graph_extractor.py\", line 146, in _process_document\n response = await self._llm(\n ^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/json_parsing_llm.py\", line 34, in __call__\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/openai_token_replacing_llm.py\", line 37, in __call__\n return await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/openai_history_tracking_llm.py\", line 33, in __call__\n output = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/caching_llm.py\", line 104, in __call__\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 177, in __call__\n result, start = await execute_with_retry()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 159, in execute_with_retry\n async for attempt in retryer:\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/asyncio/__init__.py\", line 166, in __anext__\n do = await self.iter(retry_state=self._retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/asyncio/__init__.py\", line 153, in iter\n result = await action(retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/_utils.py\", line 99, in inner\n return call(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/__init__.py\", line 398, in <lambda>\n self._add_action_func(lambda rs: rs.outcome.result())\n ^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/concurrent/futures/_base.py\", line 449, in result\n return self.__get_result()\n ^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/concurrent/futures/_base.py\", line 401, in __get_result\n raise self._exception\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 162, in execute_with_retry\n await self._rate_limiter.acquire(input_tokens)\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/limiting/tpm_rpm_limiter.py\", line 32, in acquire\n await self._tpm_limiter.acquire(num_tokens)\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/aiolimiter/leakybucket.py\", line 95, in acquire\n raise ValueError(\"Can't acquire more than the maximum capacity\")\nValueError: Can't acquire more than the maximum capacity\n", "source": "Can't acquire more than the maximum capacity", "details": {"doc_index": 0, "text": "\ufeffThe Project Gutenberg eBook of A Christmas Carol\n \nThis ebook is for the use of anyone anywhere in the United States and\nmost other parts of the world at no cost and with almost no restrictions\nwhatsoever. You may copy it, give it away or re-use it under the terms\nof the Project Gutenberg License included with this ebook or online\nat www.gutenberg.org. If you are not located in the United States,\nyou will have to check the laws of the country where you are"}}
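
Reading the last frames of that traceback: the tokens-per-minute limiter is asked to acquire num_tokens and refuses because the amount is larger than its capacity. A rough sketch of that check, using a hypothetical stand-in class (CapacityLimiter and try_acquire are made up here; this is not the aiolimiter or GraphRAG implementation, and the numbers are invented):

```python
class CapacityLimiter:
    """Hypothetical stand-in for the token limiter seen in the traceback."""

    def __init__(self, max_rate: float) -> None:
        self.max_rate = max_rate

    def acquire(self, amount: float = 1) -> None:
        # A single request larger than the whole budget can never be served,
        # so it is rejected outright instead of waiting.
        if amount > self.max_rate:
            raise ValueError("Can't acquire more than the maximum capacity")


def try_acquire(limiter: CapacityLimiter, tokens: int) -> str:
    """Return 'ok' or a short description of why the request was rejected."""
    try:
        limiter.acquire(tokens)
    except ValueError as exc:
        return f"rejected: {exc}"
    return "ok"


limiter = CapacityLimiter(max_rate=50)  # assumed budget, for illustration only
print(try_acquire(limiter, 25))   # fits within the budget
print(try_acquire(limiter, 100))  # larger than the budget
```

If this reading is right, it may be worth checking whether the configured tokens-per-minute limit is smaller than the token count of a single request.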

yakeworld commented 3 weeks ago

{"type": "error", "data": "Error executing verb \"cluster_graph\" in create_base_entity_graph: Columns must be same length as key", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n ~~~~~^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n self._setitem_array(key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n ~~~~~^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n self._setitem_array(key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}

yurochang commented 3 weeks ago

same problem

armolee commented 2 weeks ago

Same problem. I could not find the config line to comment out.

night666e commented 1 week ago

Same problem:

[226 rows x 5 columns]
🚀 create_base_extracted_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
🚀 create_summarized_entities
entity_graph
0 <graphml xmlns="http://graphml.graphdrawing.or...
❌ create_base_entity_graph
None
⠧ GraphRAG Indexer
├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
└── create_base_entity_graph
❌ Errors occurred during the pipeline run, see logs for more details.

Is there a solution?

night666e commented 1 week ago

In settings.yml, comment out the model supports json line and it will work.

https://github.com/microsoft/graphrag/issues/485#issuecomment-2232281213 This one seems to have been true originally; does it need to be changed to false?