Closed Jainil-Gosalia closed 1 month ago
In your configuration (YAML), you are using mistral as an embedding model, and that might be what causes the inhomogeneous shape error. You can use an embedding model from nomic-ai or mixedbread instead.
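To check whether the embedding endpoint is really the culprit, it can help to look at the length of each vector it returns. The sketch below is only an illustration under an assumption: that the error appears because the vectors in one batch do not all have the same length, which is what the `np.array(chunk_embeddings.output)` line in the traceback would choke on. The helper names and the sample batches are hypothetical and are not part of GraphRAG.

```python
from collections import Counter
from typing import Sequence


def embedding_length_report(embeddings: Sequence[Sequence[float]]) -> Counter:
    """Count how many vectors in the batch have each length."""
    return Counter(len(vec) for vec in embeddings)


def is_homogeneous(embeddings: Sequence[Sequence[float]]) -> bool:
    """True when every vector has the same length, so the batch can be stacked into a 2-D array."""
    return len(embedding_length_report(embeddings)) <= 1


# Hypothetical batches standing in for what an embedding endpoint might return.
ragged_batch = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1]]
uniform_batch = [[0.1, 0.2], [0.3, 0.4]]

print(embedding_length_report(ragged_batch))  # lengths 4 and 3 both appear
print(is_homogeneous(ragged_batch))
print(is_homogeneous(uniform_batch))
```

If the report shows more than one length for a single batch, the model or server is probably not returning fixed-size embeddings, and switching to a dedicated embedding model is a reasonable next step.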
When I faced the issue, I created a repository for deploying Hugging Face models to local endpoints, offering functionality similar to OpenAI APIs. You can find the repo here: https://github.com/rushizirpe/open-llm-server
Also, I've prepared a Colab notebook for the GraphRAG demo that you might want to take a look at: https://colab.research.google.com/drive/1uhFDnih1WKrSRQHisU-L6xw6coapgR51?usp=sharing. If you don't have access to GPUs like the A100, you'll need a GROQ_API_KEY (free, with certain limitations), which you can obtain from: https://console.groq.com/keys
Consolidating alternate model issues here: https://github.com/microsoft/graphrag/issues/657
Describe the issue
I was trying to run GraphRAG using llama_cpp and got the following error:
❌ create_final_entities
None
⠴ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━ 100% 0:00:… 0:00:…
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
└── create_final_entities
❌ Errors occurred during the pipeline run, see logs for more details.
Steps to reproduce
Use the settings.yaml file to replicate the issue
GraphRAG Config Used
The settings.yaml is as follows:
Logs and screenshots
Indexing Engine Log file shows this:
Logs.json File shows this:
{"type": "error", "data": "Error executing verb \"text_embed\" in create_final_entities: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py\", line 415, in _execute_verb\n result = await result\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py\", line 105, in text_embed\n return await _text_embed_in_memory(\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py\", line 130, in _text_embed_in_memory\n result = await strategy_exec(texts, callbacks, cache, strategy_args)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 61, in run\n embeddings = await _execute(llm, text_batches, ticker, semaphore)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 105, in _execute\n results = await asyncio.gather(*futures)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 100, in embed\n result = np.array(chunk_embeddings.output)\nValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.\n", "source": "setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n File \"/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n File \"/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py\", line 415, in _execute_verb\n result = await result\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py\", line 105, in text_embed\n return await _text_embed_in_memory(\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py\", line 130, in _text_embed_in_memory\n result = await strategy_exec(texts, callbacks, cache, strategy_args)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 61, in run\n embeddings = await _execute(llm, text_batches, ticker, semaphore)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 105, in _execute\n results = await asyncio.gather(*futures)\n File \"/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py\", line 100, in embed\n result = np.array(chunk_embeddings.output)\nValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.\n", "source": "setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "details": null}
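Both entries above end in the same `ValueError`, which is easy to miss inside the long `stack` strings. A minimal sketch for pulling the message and the final traceback line out of one JSON log line might look like this. It assumes each line of the log is a standalone JSON object with `type`, `data` and `stack` keys, as in the entries above. The `summarize_log_line` helper and the shortened `sample` entry are hypothetical.

```python
import json


def summarize_log_line(line: str) -> dict:
    """Return the entry type, its message, and the last non-empty line of its stack trace."""
    entry = json.loads(line)
    stack = entry.get("stack") or ""
    stack_lines = [s for s in stack.splitlines() if s.strip()]
    return {
        "type": entry.get("type"),
        "data": entry.get("data"),
        "exception": stack_lines[-1] if stack_lines else None,
    }


# Shortened, hypothetical entry with the same keys as the log lines above.
sample = json.dumps({
    "type": "error",
    "data": "Error running pipeline!",
    "stack": (
        "Traceback (most recent call last):\n"
        '  File "run.py", line 323, in run_pipeline\n'
        "ValueError: setting an array element with a sequence.\n"
    ),
    "source": "setting an array element with a sequence.",
    "details": None,
})

print(summarize_log_line(sample))
```

Running the helper over every line of the log file gives a short list of distinct exceptions, which makes it easier to see that both entries trace back to the same embedding step.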
Additional Information