microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

Leave get_model, error: Model not found, uid: custom-glm4-chat-1-0 #1205

Open shuifuture opened 6 days ago

shuifuture commented 6 days ago

Do you need to file an issue?

Describe the issue

My configuration:

```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: xinference
  type: openai_chat # or azure_openai_chat
  model: custom-glm4-chat
  model_supports_json: true # recommended if this is available for your model
  max_tokens: 4000
  request_timeout: 180.0
  api_base: http://0.0.0.0:9997/v1
  api_version: 2024-02-15-preview
  organization:
  deployment_name:
  tokens_per_minute: 150_000 # set a leaky-bucket throttle
  requests_per_minute: 10_000 # set a leaky-bucket throttle
  max_retries: 10
  max_retry_wait: 10.0
  sleep_on_rate_limit_recommendation: true # whether to sleep when Azure suggests wait times
  concurrent_requests: 25 # the number of parallel in-flight requests that may be made
  temperature: 0.4 # temperature for sampling
  top_p: 1 # top-p sampling
  n: 1 # number of completions to generate
```
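One way to narrow down a "Model not found" error is to confirm that the `model:` value in the config matches a model id the server actually serves, since OpenAI-compatible backends such as Xinference expose the served models at `GET {api_base}/models`. A minimal sketch of that comparison; the sample response payload below is hypothetical and stands in for what the endpoint would return:

```python
import json

# Hypothetical sample of a GET http://0.0.0.0:9997/v1/models response body
# from an OpenAI-compatible server (exact fields are an assumption).
sample_response = json.dumps({
    "object": "list",
    "data": [
        {"id": "custom-glm4-chat", "object": "model"},
    ],
})

def served_model_ids(body: str) -> list[str]:
    """Extract the model ids from a /v1/models response body."""
    return [m["id"] for m in json.loads(body)["data"]]

configured = "custom-glm4-chat"  # the `model:` value from the config above
ids = served_model_ids(sample_response)
print(configured in ids)  # False here would explain a "Model not found" error
```

If the configured name is absent from the list (for example, the server reports a replica uid like `custom-glm4-chat-1-0` instead), the request will fail before GraphRAG ever runs a workflow.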

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

shuifuture commented 6 days ago