microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

Leave get_model, error: Model not found, uid: custom-glm4-chat-1-0 #1205

Open shuifuture opened 6 days ago

shuifuture commented 6 days ago

Do you need to file an issue?

Describe the issue

My configuration:

```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: xinference
  type: openai_chat # or azure_openai_chat
  model: custom-glm4-chat
  model_supports_json: true # recommended if this is available for your model
  max_tokens: 4000
  request_timeout: 180.0
  api_base: http://0.0.0.0:9997/v1
  api_version: 2024-02-15-preview
  organization:
  deployment_name:
  tokens_per_minute: 150_000 # set a leaky-bucket throttle
  requests_per_minute: 10_000 # set a leaky-bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when Azure suggests wait times
  concurrent_requests: 25 # the number of parallel in-flight requests that may be made
  temperature: 0.4 # temperature for sampling
  top_p: 1 # top-p sampling
  n: 1 # number of completions to generate
```
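One way to narrow down a "Model not found" error is to confirm that the `model:` value in the config matches a model id the server actually serves, since OpenAI-compatible backends such as Xinference expose the served models at `GET {api_base}/models`. A minimal sketch of that comparison; the sample response payload below is hypothetical and stands in for what the endpoint would return:

```python
import json

# Hypothetical sample of a GET http://0.0.0.0:9997/v1/models response body
# from an OpenAI-compatible server (exact fields are an assumption).
sample_response = json.dumps({
    "object": "list",
    "data": [
        {"id": "custom-glm4-chat", "object": "model"},
    ],
})

def served_model_ids(body: str) -> list[str]:
    """Extract the model ids from a /v1/models response body."""
    return [m["id"] for m in json.loads(body)["data"]]

configured = "custom-glm4-chat"  # the `model:` value from the config above
ids = served_model_ids(sample_response)
print(configured in ids)  # False here would explain a "Model not found" error
```

If the configured name is absent from the list (for example, the server reports a replica uid like `custom-glm4-chat-1-0` instead), the request will fail before GraphRAG ever runs a workflow.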

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

shuifuture commented 6 days ago