Closed: goodmaney closed this issue 4 months ago.
Same here. I am using Gemma 2 9B, which only has an 8k context window. I set local search to 5000 max tokens and it went back to normal. Otherwise it silently reports over_capacity and you see nothing:
local_search:
  max_tokens: 5000
ChatCompletionChunk(id='chatcmpl-82228b8b-8279-44a5-bb8f-0f14c57ab4dd', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1720694560, model='gemma2-9b-it', object='chat.completion.chunk', service_tier=None, system_fingerprint=None, usage=None, x_groq={'id': 'req_01j2gp6nqvf5zsbbszhywpceqv'})
ChatCompletionChunk(id='chatcmpl-82228b8b-8279-44a5-bb8f-0f14c57ab4dd', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1720694560, model='gemma2-9b-it', object='chat.completion.chunk', service_tier=None, system_fingerprint=None, usage=None, x_groq={'id': 'req_01j2gp6nqvf5zsbbszhywpceqv', 'error': 'over_capacity'})
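When the failure is this silent, it helps to inspect the streamed chunks directly rather than only the concatenated text. A minimal sketch, assuming an OpenAI-compatible client like the one that produced the chunks above (the base URL, API key, model, and query are placeholders):

```python
from openai import OpenAI

# Placeholder client setup; point it at whatever OpenAI-compatible endpoint you use.
client = OpenAI(base_url="https://api.groq.com/openai/v1", api_key="...")

stream = client.chat.completions.create(
    model="gemma2-9b-it",
    messages=[{"role": "user", "content": "test query"}],
    stream=True,
)

for chunk in stream:
    # Vendor-specific fields such as x_groq are not part of the typed schema,
    # so dump the raw model and look for a silent error marker there.
    extra = chunk.model_dump().get("x_groq") or {}
    if extra.get("error"):
        raise RuntimeError(f"Provider reported an error: {extra['error']}")
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

In the log above, the error only appears in the final chunk's x_groq field, which is why the query looks successful while returning nothing.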
I set 5000 and it does not work, but 4200 works, and that looks like the maximum. GLM-4 has a 128k context window, so I don't know whether max_tokens in local search is tied to the LLM. What embedding model are you using?
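For reference, the local-search context budget is configured separately from the answer-generation budget. A sketch of the relevant settings.yaml block (values are illustrative, and llm_max_tokens is my reading of the LocalSearchConfig field name, so treat it as an assumption if your version differs):

```yaml
local_search:
  # Budget for the context built from entities, relationships, and text units.
  # Keep this comfortably below the serving model's context window.
  max_tokens: 4200
  # Budget for the generated answer itself (assumed field name).
  llm_max_tokens: 1000
```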
Consolidating alternate model issues here: #657
Thanks!!! You're a lifesaver!!! Wishing you all the best!!!
Describe the bug
Global search works well. Local search does not report an error but responds with null. I use Xinference to load the LLM and the embedding model; the embedding endpoint is being called when local search executes.
Steps to reproduce
(Screenshots were attached showing: my test file content, my prompt, the embedding running status, and the final response.)
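For context, local search is invoked through the GraphRAG CLI roughly as follows (the root directory and question are placeholders, not the exact values used here):

```
python -m graphrag.query --root ./ragtest --method local "question about the test file"
```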
Expected Behavior
No response
GraphRAG Config Used
encoding_model: cl100k_base
skip_workflows: []

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: glm4-chat-test
  model_supports_json: true # or false
  api_base: http://127.0.0.1:9997/v1

parallelization:
  stagger: 0.3

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  llm:

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 0

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:

global_search:
Logs and screenshots
Excerpt from indexing-engine.log:
File "/home/xx/anaconda3/envs/graphrag/lib/python3.11/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1) 00:12:44,69 graphrag.index.reporting.file_workflow_callbacks INFO Community Report Extraction Error details=None
logs.json
{"type": "error", "data": "Community Report Extraction Error", "stack": "Traceback (most recent call last):\n File \"/home/xx/graphrag/graphrag/index/graph/extractors/community_reports/community_reports_extractor.py\", line 58, in call\n await self._llm(\n File \"/home/xx/graphrag/graphrag/llm/openai/json_parsing_llm.py\", line 34, in call\n result = await self._delegate(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/openai/openai_token_replacing_llm.py\", line 37, in call\n return await self._delegate(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/openai/openai_history_tracking_llm.py\", line 33, in call\n output = await self._delegate(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/base/caching_llm.py\", line 104, in call\n result = await self._delegate(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 177, in call\n result, start = await execute_with_retry()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 159, in execute_with_retry\n async for attempt in retryer:\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/site-packages/tenacity/asyncio/init.py\", line 166, in anext\n do = await self.iter(retry_state=self._retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/site-packages/tenacity/asyncio/init.py\", line 153, in iter\n result = await action(retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/site-packages/tenacity/_utils.py\", line 99, in inner\n return call(*args, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/site-packages/tenacity/init.py\", line 398, in\n self._add_action_func(lambda rs: rs.outcome.result())\n ^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/concurrent/futures/_base.py\", line 449, in result\n return self.get_result()\n ^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/concurrent/futures/_base.py\", line 401, in get_result\n raise self._exception\n File \"/home/xx/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 165, in execute_with_retry\n return await do_attempt(), start\n ^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 147, in do_attempt\n return await self._delegate(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/base/base_llm.py\", line 48, in call\n return await self._invoke_json(input, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/openai/openai_chat_llm.py\", line 82, in _invoke_json\n result = await generate()\n ^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/openai/openai_chat_llm.py\", line 74, in generate\n await self._native_json(input, {**kwargs, \"name\": call_name})\n File \"/home/xx/graphrag/graphrag/llm/openai/openai_chat_llm.py\", line 108, in _native_json\n json_output = try_parse_json_object(raw_output)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/graphrag/graphrag/llm/openai/utils.py\", line 93, in try_parse_json_object\n result = json.loads(input)\n ^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/json/init.py\", line 346, in loads\n return _default_decoder.decode(s)\n 
^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/json/decoder.py\", line 337, in decode\n obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/home/xx/anaconda3/envs/graphrag/lib/python3.11/json/decoder.py\", line 355, in raw_decode\n raise JSONDecodeError(\"Expecting value\", s, err.value) from None\njson.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)\n", "source": "Expecting value: line 2 column 1 (char 1)", "details": null}
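The traceback goes through _native_json in openai_chat_llm.py, the path that (as far as I can tell) is taken when model_supports_json is true, and it fails because the model's reply is not valid JSON. If GLM-4 served through Xinference does not reliably emit raw JSON, a config change along these lines may be worth trying; this is a sketch of a possible workaround, not a confirmed fix:

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: glm4-chat-test
  api_base: http://127.0.0.1:9997/v1
  # With this set to false, GraphRAG should fall back to prompting for JSON
  # and parsing the text itself instead of relying on a native JSON mode.
  model_supports_json: false
```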
Additional Information