Closed: BovineOverlord closed this issue 3 days ago
Hi! My general rule of thumb when facing these issues is:
Can you please check your cache entries for Entity Extraction to see whether the LLM is providing faulty responses?
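For anyone who wants to script that check, here is a minimal sketch. It assumes the default project layout, where the indexer writes one cache file per entity-extraction LLM response under `cache/entity_extraction`:

```python
from pathlib import Path

# Assumption: cached entity-extraction responses live under
# cache/entity_extraction in the project root (default layout).
cache_dir = Path("cache/entity_extraction")
responses = sorted(cache_dir.glob("*")) if cache_dir.exists() else []
print(f"{len(responses)} cached entity-extraction responses")
for path in responses[:5]:  # peek at the first few entries
    print(path.name)
```

If this prints `0 cached entity-extraction responses`, the extraction step never got a usable answer back from the model.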
The entity extraction directory is empty. I tried two other models and was met with the same result.
Facing the same thing. cache/entity_extraction is empty, and the logs show the same exact error.
same error:
Here is my log: indexing-engine.log
same error, Entity extraction directory is empty.
Same error, but my entity_extraction directory is not empty.
settings.yaml
Same error lol, but my entity_extraction and summarize_descriptions directories are also not empty.
Same error here. Why is this happening?
same problem.
+1
Consolidating alternate model issues here: #657
Describe the bug
{"type": "error", "data": "Error executing verb \"cluster_graph\" in create_base_entity_graph: Columns must be same length as key", "stack": "Traceback (most recent call last):\n File \"C:\Program Files\Python310\lib\site-packages\datashaper\workflow\workflow.py\", line 410, in _execute_verb\n result = node.verb.func(verb_args)\n File \"C:\Program Files\Python310\lib\site-packages\graphrag\index\verbs\graph\clustering\cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py\", line 4299, in setitem\n self._setitem_array(key, value)\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\indexers\utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}

{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"C:\Program Files\Python310\lib\site-packages\graphrag\index\run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n File \"C:\Program Files\Python310\lib\site-packages\datashaper\workflow\workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n File \"C:\Program Files\Python310\lib\site-packages\datashaper\workflow\workflow.py\", line 410, in _execute_verb\n result = node.verb.func(verb_args)\n File \"C:\Program Files\Python310\lib\site-packages\graphrag\index\verbs\graph\clustering\cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py\", line 4299, in setitem\n self._setitem_array(key, value)\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"C:\Program Files\Python310\lib\site-packages\pandas\core\indexers\utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
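The pandas failure in the trace is easy to reproduce on its own: assigning a DataFrame to a list of column keys raises exactly this error when the right-hand DataFrame has a different number of columns than the key list. A minimal sketch, assuming (per the empty `cache/entity_extraction` reports in this thread) that the clustering step produced an empty result set:

```python
import pandas as pd

# cluster_graph does roughly: output_df[[level_to, to]] = pd.DataFrame(results)
# Assumption: when entity extraction yields nothing, `results` is empty, so the
# right-hand DataFrame has 0 columns while the key list names 2.
output_df = pd.DataFrame({"clustered_graph": []})
results = []  # no (level, graph) rows because the entity graph was empty
try:
    output_df[["level", "clustered_graph"]] = pd.DataFrame(results)
except ValueError as exc:
    print(exc)  # -> Columns must be same length as key
```

So the "Columns must be same length as key" message is a downstream symptom; the root cause is upstream in entity extraction returning nothing usable.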
Steps to reproduce
I was using a local Ollama model with the tool. It ran fine and loaded the test file before the error occurred.
Expected Behavior
The tool should have proceeded with the following step, "create_base_text_units", rather than ceasing operation. It appears to be a bug in the graph clustering function.
GraphRAG Config Used
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: command-r-plus:104b-q4_0
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 2000
  request_timeout: 180.0
  api_base: http://localhost:11434/v1
  api_version: 2024-02-15-preview
  organization:
  deployment_name:
  tokens_per_minute: 150_000 # set a leaky bucket throttle
  requests_per_minute: 10_000 # set a leaky bucket throttle
  max_retries: 1
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  concurrent_requests: 1 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: qwen2:7b-instruct
    api_base: http://localhost:11434/api
No changes to the remainder of the config.
Logs and screenshots
Additional Information