microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
20.11k stars 1.97k forks

[Bug]: How to solve ValueError: Columns must be same length as key? #670

Closed Da1zheng closed 4 months ago

Da1zheng commented 4 months ago

### Describe the bug

I rewrote the English prompts in Chinese. I also set the encoding of some files in the directory /graphrag/config/models to utf-8. After that, running GraphRAG fails with the following error:

  File "C:\Users\22051\.conda\envs\graphrag\Lib\site-packages\graphrag\index\verbs\graph\clustering\cluster_graph.py", line 102, in cluster_graph
    output_df[[level_to, to]] = pd.DataFrame(
  File "C:\Users\22051\.conda\envs\graphrag\Lib\site-packages\pandas\core\frame.py", line 4299, in __setitem__
    self._setitem_array(key, value)
  File "C:\Users\22051\.conda\envs\graphrag\Lib\site-packages\pandas\core\frame.py", line 4341, in _setitem_array
    check_key_length(self.columns, key, value)
  File "C:\Users\22051\.conda\envs\graphrag\Lib\site-packages\pandas\core\indexers\utils.py", line 390, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
15:58:11,902 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None
How can I solve this problem? Thanks!
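
For context, here is a minimal sketch of how pandas raises this error, independent of GraphRAG. It rests on an assumption that this thread does not confirm: that the DataFrame on the right-hand side of the assignment ends up with fewer columns than the two keys on the left, for example because the list column being expanded is empty. The names `expand_pairs`, `reproduce_error`, `clusters`, `level` and `cluster_id` are hypothetical and are not GraphRAG identifiers.

```python
import pandas as pd


def expand_pairs(df: pd.DataFrame, source: str, targets: list[str]) -> pd.DataFrame:
    """Split a column of fixed-length lists into one column per target name."""
    out = df.copy()
    expanded = pd.DataFrame(out[source].tolist(), index=out.index)
    # Guard: pandas needs exactly one right-hand column per key on the left.
    if expanded.shape[1] != len(targets):
        raise ValueError(
            f"expected {len(targets)} columns from '{source}', got {expanded.shape[1]}"
        )
    out[targets] = expanded
    return out


def reproduce_error() -> str:
    """Trigger the pandas error with an empty list column (assumed cause)."""
    empty = pd.DataFrame({"clusters": []})
    try:
        # Expanding an empty list column yields a DataFrame with zero columns,
        # which cannot be assigned to two column keys.
        empty[["level", "cluster_id"]] = pd.DataFrame(
            empty["clusters"].tolist(), index=empty.index
        )
    except ValueError as exc:
        return str(exc)
    return "no error"
```

If this assumption matches the failing run, an explicit check like the one in `expand_pairs` would surface an empty intermediate result earlier and with a clearer message than the pandas error.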

### Steps to reproduce

_No response_

### Expected Behavior

_No response_

### GraphRAG Config Used

I replaced the entity types as follows:
entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [人物,组织,事件,宝物,生物,时间,文化背景,符号和标记]
  max_gleanings: 0

### Logs and screenshots

![image](https://github.com/user-attachments/assets/c6266cdd-4966-474b-aa24-7ca0207a8c8c)
![image](https://github.com/user-attachments/assets/d2c76f32-f046-486d-ad36-4f113f681a76)

### Additional Information

- GraphRAG Version: 0.1.1
- Operating System: Windows 11
- Python Version: 3.11.0
- Related Issues:

Anthonyfhd commented 4 months ago

I have the same problem, and I don't know how to deal with it.

johnmendez2 commented 4 months ago

Same problem here

etiennebonnafoux commented 4 months ago

Same problem here. It has already been reported in #653, #631 and #514.

etiennebonnafoux commented 4 months ago

In the second link, there is a hint about using a different model, but we don't have a list of which models work and which do not. As for me, I use gpt-35-turbo and text-embedding-3-small on Azure.

natoverse commented 4 months ago

Closing as duplicate