microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
17.27k stars · 1.65k forks

Global search error #682

Closed wangruidedie closed 4 weeks ago

wangruidedie commented 1 month ago

Describe the bug

Local search works fine, but global search fails as follows:

```
INFO: Reading settings from ragtest\settings.yaml
creating llm client with {'api_key': 'REDACTED,len=6', 'type': "openai_chat", 'model': 'qwen2:latest', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': 'http://127.0.0.1:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
Error parsing search response json
Traceback (most recent call last):
  File "D:\software\anaconda3\envs\graphrag\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 194, in _map_response_single_batch
    processed_response = self.parse_search_response(search_response)
  File "D:\software\anaconda3\envs\graphrag\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 232, in parse_search_response
    parsed_elements = json.loads(search_response)["points"]
  File "D:\software\anaconda3\envs\graphrag\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "D:\software\anaconda3\envs\graphrag\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\software\anaconda3\envs\graphrag\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

SUCCESS: Global Search Response: I am sorry but I am unable to answer this question given the provided data.
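The failure above happens when `json.loads` is handed an LLM reply that is not a bare JSON object: local models often prepend prose around the expected `{"points": [...]}` payload. A minimal sketch of a more tolerant parse (a hypothetical helper, not part of GraphRAG itself):

```python
import json
import re


def extract_points(search_response: str) -> list:
    """Pull the "points" list out of an LLM reply that may wrap its JSON in prose.

    GraphRAG's parse_search_response expects the raw response to be a JSON
    object with a top-level "points" key; extra text before or after the
    object triggers the JSONDecodeError shown above. This sketch extracts
    the outermost {...} span before parsing.
    """
    match = re.search(r"\{.*\}", search_response, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in search response")
    return json.loads(match.group(0))["points"]
```

This is only a workaround sketch; if the model's temperature is high enough that the JSON itself is malformed, no amount of post-processing will recover it.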

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

No response

Logs and screenshots

No response

Additional Information

aiChatGPT35User123 commented 1 month ago

Did you solve it? I'm getting the same error.

kohlz commented 1 month ago

Similar issue; you can print the input to inspect it:

Similar issue. I printed the input out, and there actually is a response from the LLM for the query. Is this issue related to the LLM?

```
INFO: Reading settings from test3\settings.yaml
creating llm client with {'api_key': 'REDACTED,len=4', 'type': "openai_chat", 'model': 'gpt-3.5-turbo', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': 'EDITED, model ran locally', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
```

The given text is discussing two separate communities related to the intersection of deep learning and stem cell research. The first community is titled "# Deep Learning and Stem Cell Research Community," which has a high occurrence weight of 1.0. This community focuses on the advancements in deep learning applied to stem cell research, particularly in cancer stem cells (CSCs), and how it's used as a tool in the field. The text highlights the mini-review as a concise overview and a knowledge hub for researchers, mentioning the importance of deep learning in understanding CSCs and the need for regulatory and ethical considerations.

The second community is titled "# Deep Learning in Stem Cell Research Community," which has a moderate occurrence weight of 1.0. This community also revolves around the integration of deep learning in stem cell research, with a central focus on stem cell research itself. The text mentions the use of Convolutional Neural Networks (CNNs) in research and the potential implications for breakthroughs, ethical concerns, and the collaborative nature of the community.

Both communities are characterized by their focus on the application of deep learning in stem cell research and the potential impact on scientific advancements, regulatory frameworks, and ethical considerations.
```
Error parsing search response json
Traceback (most recent call last):
  File "C:\Users\zhuhz\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 189, in _map_response_single_batch
    processed_response = self.parse_search_response(search_response)
  File "C:\Users\zhuhz\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 238, in parse_search_response
    raise ValueError("No JSON object found in search response")
ValueError: No JSON object found in search response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\zhuhz\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 195, in _map_response_single_batch
    processed_response = self.parse_search_response(search_response)
  File "C:\Users\zhuhz\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\query\structured_search\global_search\search.py", line 238, in parse_search_response
    raise ValueError("No JSON object found in search response")
ValueError: No JSON object found in search response
```

SUCCESS: Global Search Response: I am sorry but I am unable to answer this question given the provided data.

wangruidedie commented 1 month ago

The problem went away after I switched the model to gpt-4o-mini.

xldistance commented 1 month ago

Does that mean global search can only be queried with the same model that was used to build the index data?

goodmaney commented 1 month ago

Same here. Global search succeeded once before, but I haven't been able to reproduce it since.

najsword commented 1 month ago

> Same here. Global search succeeded once before, but I haven't been able to reproduce it since.

What tool are you using to deploy the embedding model, and which embedding model is it?

inspirewind commented 1 month ago

The JSON your LLM returns may be malformed.

aiChatGPT35User123 commented 1 month ago

> The JSON your LLM returns may be malformed.

Yes. I was running the model with ollama, and I had set the temperature too high when creating the model, which broke the format of the responses.

kakalong136 commented 1 month ago

> The JSON your LLM returns may be malformed.

> Yes. I was running the model with ollama, and I had set the temperature too high when creating the model, which broke the format of the responses.

Where is this temperature set: in GraphRAG's settings, or on the model side?
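To the question above: recent GraphRAG versions expose the temperature in `settings.yaml` under the `llm` block, so you can set it on the GraphRAG side instead of (or in addition to) the model server. A hedged sketch, with field names taken from the default config template and an assumed local ollama endpoint:

```yaml
llm:
  type: openai_chat
  api_base: http://127.0.0.1:11434/v1   # ollama's OpenAI-compatible endpoint (assumption)
  model: llama3.1
  model_supports_json: true
  temperature: 0.3   # lower values make the model's JSON output more reliable
```

If your GraphRAG version predates the `temperature` field, set it when creating the model in ollama instead.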

goodmaney commented 1 month ago

> Same here. Global search succeeded once before, but I haven't been able to reproduce it since.

> What tool are you using to deploy the embedding model, and which embedding model is it?

xinference, glm4-chat, bce-embedding-base-v1

xldistance commented 1 month ago

> The JSON your LLM returns may be malformed.

> Yes. I was running the model with ollama, and I had set the temperature too high when creating the model, which broke the format of the responses.

It is a model problem. With the llama3.1 8b chinese model and temperature set to 0.3, global search now works for me.

goodmaney commented 1 month ago

> llama3.1 8b chinese

It still throws the error, though it does print some output after the error. I'm using xinference; are you on ollama?

xldistance commented 1 month ago

> llama3.1 8b chinese

> It still throws the error, though it does print some output after the error. I'm using xinference; are you on ollama?

It can answer questions normally now, right? The error just means the dict containing the "points" key is missing from the response.

XiaoTongDeng commented 1 month ago

> llama3.1 8b chinese

> It still throws the error, though it does print some output after the error. I'm using xinference; are you on ollama?

How do you connect xinference to graphrag?

goodmaney commented 1 month ago

> llama3.1 8b chinese

> It still throws the error, though it does print some output after the error. I'm using xinference; are you on ollama?

> How do you connect xinference to graphrag?

In settings, set `api_base` to the port that xinference-local prints at startup; the default is usually http://127.0.0.1:9997/v1. The embedding model usually uses the same address, though xinference can also serve embeddings on a separate port.
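For reference, the relevant `settings.yaml` fragment described above might look like this (a sketch assuming xinference-local's default port; adjust to your deployment):

```yaml
llm:
  api_base: http://127.0.0.1:9997/v1    # xinference-local's default OpenAI-compatible endpoint

embeddings:
  llm:
    api_base: http://127.0.0.1:9997/v1  # same endpoint, unless embeddings run on a separate port
```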

XiaoTongDeng commented 1 month ago

> llama3.1 8b chinese

> It still throws the error, though it does print some output after the error. I'm using xinference; are you on ollama?

> How do you connect xinference to graphrag?

> In settings, set `api_base` to the port that xinference-local prints at startup; the default is usually http://127.0.0.1:9997/v1. The embedding model usually uses the same address, though xinference can also serve embeddings on a separate port.

If I just edit the settings directly, it gets stuck at this point with no further output...

najsword commented 1 month ago

If you run into problems, join QQ group 976131420 and ask 心如薄荷 for help.

natoverse commented 1 month ago

We have checks in place to filter out any community responses that the LLM deems low-relevance. Sometimes no relevant summaries survive the filter, and the end result is that it can't answer the question. This is a deliberately cautious approach to avoid hallucination. You may get better results if you tune the prompt to your domain, adjusting how the LLM assesses relevance and assigns the "Importance Score" here.

github-actions[bot] commented 1 month ago

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions[bot] commented 4 weeks ago

This issue has been closed after being marked as stale for five days. Please reopen if needed.