Open realSAH opened 1 month ago
Hi @realSAH Can you please share your config yaml file?
```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: "qwetetrrwety"
  # type: openai_chat # or azure_openai_chat
  type: azure_openai_chat
  api_base: "htsdfsdfasdfasdfasdf"
  api_version: 2024-02-15-preview
  deployment_name: gpt-4-turbo-default
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  # api_base: https://<instance>.openai.azure.com
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    # api_key: ${GRAPHRAG_API_KEY}
    # type: openai_embedding # or azure_openai_embedding
    # model: text-embedding-3-small
    api_key: "wefwef"
    type: azure_openai_chat
    api_base: "qergwerg"
    api_version: 2024-02-15-preview
    deployment_name: text-embedding-ada-002
    # api_base: https://<instance>.openai.azure.com
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
  # target: required # or optional

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  # file_pattern: ".*\\.txt$"
  file_pattern: ".*\\.md$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, nurse, doctor, drug, order, test, plan, radiology, imaging, procedure, diagnosis, medication, allergy, condition, patient]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
```
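One thing worth noting about this config: the leaky-bucket throttling keys are commented out, so graphrag falls back to its defaults rather than pacing itself to the deployment's quota. A sketch of pinning them down, assuming a modest Azure S0-tier quota (the numbers below are illustrative; read your deployment's actual TPM/RPM limits from the Azure portal and set these to, or just under, them):

```yaml
llm:
  # illustrative values, not a recommendation; match your deployment's quota
  tokens_per_minute: 80_000
  requests_per_minute: 480
  max_retries: 10
  sleep_on_rate_limit_recommendation: true # honor Azure's suggested wait
  concurrent_requests: 5 # fewer parallel in-flight calls means fewer 429s
```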
Can you try again with version 0.2.2. We believe there was a file loading issue during query that is resolved in the latest release.
@natoverse
Hi. I got the same error with 0.2.2.
While repeating everything from scratch, I noticed that not all processes completed successfully while building the indices, and upon inspection of the logs I read that there were rateLimit errors.
I have two observations to make here:
UPDATE:
Thankfully, the library was smart enough to recognize prior work. I repeatedly ran the index command (5 times) to get around the rateLimit failures until the indexing ran successfully, and everything worked nicely from there.
So, I guess this issue boils down to: can we have a smart function that automatically waits when encountering rateLimit errors? The current behaviour seems to be to simply fail and stop.
Thanks,
Hi @realSAH We have a flag that parses the LLM response and waits the suggested time when getting any throttling error, like rate limit exceeded. This is on by default. Can you please share one of the lines where you're seeing the error?
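For illustration, the behaviour that flag describes (parse the throttling message, sleep the suggested time, retry) can be sketched as below. This is a hypothetical helper, not graphrag's actual implementation; the `retry after N seconds` phrasing is taken from the Azure 429 body quoted later in this thread.

```python
import re
import time


def suggested_wait_seconds(message: str, default: float = 10.0) -> float:
    """Extract the wait Azure suggests in a 429 body,
    e.g. '... Please retry after 28 seconds. ...'."""
    match = re.search(r"retry after (\d+) seconds", message, re.IGNORECASE)
    return float(match.group(1)) if match else default


def call_with_rate_limit_retry(func, is_throttled, max_retries: int = 5):
    """Call `func`; when `is_throttled(err)` identifies a throttling error,
    sleep for the server-suggested time and retry (hypothetical helper)."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as err:  # sketch only; narrow this in real code
            if attempt == max_retries or not is_throttled(err):
                raise
            time.sleep(suggested_wait_seconds(str(err)))
```

In graphrag itself this role is played by its rate-limiting LLM wrapper built on tenacity, as the traceback below shows.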
@AlonsoGuevara
Hi. I just tried again from scratch using 0.3.1. The error persists, and it seems graphrag is not patient enough with rateLimit errors.
The log file is gigantic (7 MB), but here are the last few lines, from which I concluded that rateLimit is to blame:
```
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/verbs/entities/summarize/description_summarize.py", line 184, in <listcomp>
    await get_resolved_entities(row, semaphore) for row in output.itertuples()
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/verbs/entities/summarize/description_summarize.py", line 147, in get_resolved_entities
    results = await asyncio.gather(*futures)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/verbs/entities/summarize/description_summarize.py", line 167, in do_summarize_descriptions
    results = await strategy_exec(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/verbs/entities/summarize/strategies/graph_intelligence/run_graph_intelligence.py", line 34, in run
    return await run_summarize_descriptions(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/verbs/entities/summarize/strategies/graph_intelligence/run_graph_intelligence.py", line 67, in run_summarize_descriptions
    result = await extractor(items=items, descriptions=descriptions)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/graph/extractors/summarize/description_summary_extractor.py", line 73, in __call__
    result = await self._summarize_descriptions(items, descriptions)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/graph/extractors/summarize/description_summary_extractor.py", line 106, in _summarize_descriptions
    result = await self._summarize_descriptions_with_llm(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/index/graph/extractors/summarize/description_summary_extractor.py", line 125, in _summarize_descriptions_with_llm
    response = await self._llm(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/openai/json_parsing_llm.py", line 34, in __call__
    result = await self._delegate(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/openai/openai_token_replacing_llm.py", line 37, in __call__
    return await self._delegate(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/openai/openai_history_tracking_llm.py", line 33, in __call__
    output = await self._delegate(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/caching_llm.py", line 96, in __call__
    result = await self._delegate(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 177, in __call__
    result, start = await execute_with_retry()
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 159, in execute_with_retry
    async for attempt in retryer:
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self._retry_state)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/tenacity/__init__.py", line 418, in exc_check
    raise retry_exc.reraise()
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/tenacity/__init__.py", line 185, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 165, in execute_with_retry
    return await do_attempt(), start
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 151, in do_attempt
    await sleep_for(sleep_time)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 147, in do_attempt
    return await self._delegate(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/base_llm.py", line 49, in __call__
    return await self._invoke(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/base/base_llm.py", line 53, in _invoke
    output = await self._execute_llm(input, **kwargs)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 53, in _execute_llm
    completion = await self.client.chat.completions.create(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 1339, in create
    return await self._post(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/openai/_base_client.py", line 1816, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/openai/_base_client.py", line 1510, in request
    return await self._request(
  File "/home/alex/venvs/rag_test/lib/python3.11/site-packages/openai/_base_client.py", line 1611, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 28 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}
```
I'm guessing the developers designed things to match the rate limits of their own Microsoft subscription, not other tiers.
Do you need to file an issue?
Describe the bug
python -m graphrag.query --root ./ragtest --method local "What drugs were given to the patient?"
Steps to reproduce
Follow the Getting Started guide, except with my own data. I reach the third step, which is running your first query (after constructing the graph), as indicated above.
Expected Behavior
No response
GraphRAG Config Used
Logs and screenshots
No response
Additional Information