chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM), a local-knowledge-based RAG and Agent app built with Langchain and LLMs such as ChatGLM, Qwen, and Llama
Apache License 2.0
31.2k stars 5.45k forks

File vectorization is a bit slow; I'd like to know how to replace my Embedding model or speed it up #3325

Closed 302658980 closed 4 months ago

302658980 commented 6 months ago

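import os

# If a model directory name matches a key or value in MODEL_PATH, the program detects and loads it automatically, so the path in MODEL_PATH does not need to be changed.

# Name of the selected Embedding model
EMBEDDING_MODEL = "bge-large-zh"

# Device the Embedding model runs on. "auto" detects it automatically; it can also be set manually to one of "cuda", "mps", or "cpu".

# Configure this when custom keywords need to be added to EMBEDDING_MODEL

# Names of the LLMs to run; both local and online models are allowed. All local models in the list are loaded when the project starts.
# The first model in the list is the default model for the API and the WEBUI.
# Here we use two currently mainstream offline models, with chatglm3-6b loaded by default.
# If GPU memory is insufficient, Qwen-1_8B-Chat can be used; it needs only 3.8 GB of GPU memory in FP16.
LLM_MODELS = ["qwen-14b", "yi-34b"] # "Qwen-1_8B-Chat",

# Name of the AgentLM model (optional; if set, the Chain used after entering Agent mode is locked to this model; if not set, LLM_MODELS[0] is used)
Agent_MODEL = None

# Device the LLM runs on. "auto" detects it automatically; it can also be set manually to one of "cuda", "mps", or "cpu".
LLM_DEVICE = "auto"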







TOP_P = 0.95 # ChatOpenAI does not support this parameter yet



"qwen-14b": {
    "model_name": "gpt-3.5-turbo",
    "api_base_url": "",
    "api_key": "none",
    "openai_proxy": "",
},
"yi-34b": {
    "model_name": "gpt-3.5-turbo",
    "api_base_url": "",
    "api_key": "none",
    "openai_proxy": "",
},

# For registration and obtaining an API key, go to
"zhipu-api": {
    "api_key": "",
    "version": "chatglm_turbo",  # options include "chatglm_turbo"
    "provider": "ChatGLMWorker",
},

# For registration and obtaining an API key, go to
"minimax-api": {
    "group_id": "",
    "api_key": "",
    "is_pro": False,
    "provider": "MiniMaxWorker",
},

# For registration and obtaining an API key, go to
"xinghuo-api": {
    "APPID": "",
    "APISecret": "",
    "api_key": "",
    "version": "v1.5",  # the iFlytek Spark (Xinghuo) model version you use; options include "v3.0", "v1.5", "v2.0"
    "provider": "XingHuoWorker",
},

# Baidu Qianfan API; for how to apply, see
"qianfan-api": {
    "version": "ERNIE-Bot",  # case-sensitive. Currently supports "ERNIE-Bot" or "ERNIE-Bot-turbo"; see the official docs for more.
    "version_url": "",  # alternatively, leave version empty and fill in the API URL of the model you deployed through Qianfan
    "api_key": "",
    "secret_key": "",
    "provider": "QianFanWorker",
},

# Volcano Ark (Fangzhou) API; for docs, see
"fangzhou-api": {
    "version": "chatglm-6b-model",  # currently supports "chatglm-6b-model"; for more, see the Fangzhou part of the supported-model list in the docs.
    "version_url": "",  # alternatively, leave version empty and fill in the API URL of the model you deployed through Fangzhou
    "api_key": "",
    "secret_key": "",
    "provider": "FangZhouWorker",
},

# Alibaba Cloud Tongyi Qianwen API; for docs, see
"qwen-api": {
    "version": "qwen-turbo",  # options include "qwen-turbo", "qwen-plus"
    "api_key": "",  # create it on the DashScope API-KEY management page of the Alibaba Cloud console
    "provider": "QwenWorker",
},

# Baichuan API; for how to apply, see
"baichuan-api": {
    "version": "Baichuan2-53B",  # currently supports "Baichuan2-53B"; see the official docs.
    "api_key": "",
    "secret_key": "",
    "provider": "BaiChuanWorker",
},

# Azure API
"azure-api": {
    "deployment_name": "",  # name of the deployment
    "resource_name": "",  # https://{resource_name} fill in only the resource_name part; do not fill in anything else
    "api_version": "",  # the API version, not the model version
    "api_key": "",
    "provider": "AzureWorker",
},

# Kunlun Wanwei Tiangong API
"tiangong-api": {
    "version": "SkyChat-MegaVerse",
    "api_key": "",
    "secret_key": "",
    "provider": "TianGongWorker",
},




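# 2. Leave the value here unchanged (taking text2vec as an example):
#    2.1 If any of the following subdirectories exists under {MODEL_ROOT_PATH}:
#        - text2vec
#        - GanymedeNil/text2vec-large-chinese
#        - text2vec-large-chinese
#    2.2 If none of the local paths above exists, the Hugging Face model is used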

MODEL_PATH = {
"embed_model": {
    "ernie-tiny": "nghuyong/ernie-3.0-nano-zh",
    "ernie-base": "nghuyong/ernie-3.0-base-zh",
    "text2vec-base": "shibing624/text2vec-base-chinese",
    "text2vec": "GanymedeNil/text2vec-large-chinese",
    "text2vec-paraphrase": "shibing624/text2vec-base-chinese-paraphrase",
    "text2vec-sentence": "shibing624/text2vec-base-chinese-sentence",
    "text2vec-multilingual": "shibing624/text2vec-base-multilingual",
    "text2vec-bge-large-chinese": "shibing624/text2vec-bge-large-chinese",
    "m3e-small": "moka-ai/m3e-small",
    "m3e-base": "moka-ai/m3e-base",
    "m3e-large": "moka-ai/m3e-large",
    "bge-small-zh": "BAAI/bge-small-zh",
    "bge-base-zh": "BAAI/bge-base-zh",
    "bge-large-zh": "/root/model/bge-large-zh",
    "bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct",
    "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5",
    "bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5",
    "piccolo-base-zh": "sensenova/piccolo-base-zh",
    "piccolo-large-zh": "sensenova/piccolo-large-zh",
    "nlp_gte_sentence-embedding_chinese-large": "damo/nlp_gte_sentence-embedding_chinese-large",
    "text-embedding-ada-002": "your OPENAI_API_KEY",
},

"llm_model": {
    # Some of the models below have not been fully tested; support is only inferred from the model lists of fastchat and vllm
    "chatglm2-6b": "/root/model/chatglm2-6b",
    "chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",

    "chatglm3-6b": "/root/model/chatglm3-6b",
    "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",
    "chatglm3-6b-base": "THUDM/chatglm3-6b-base",

    "Qwen-1_8B": "Qwen/Qwen-1_8B",
    # "Qwen-1_8B-Chat": "/root/model/Qwen-1_8B-Chat",
    "Qwen-1_8B-Chat-Int8": "Qwen/Qwen-1_8B-Chat-Int8",
    "Qwen-1_8B-Chat-Int4": "Qwen/Qwen-1_8B-Chat-Int4",

    "Qwen-7B": "Qwen/Qwen-7B",
    "Qwen-7B-Chat": "Qwen/Qwen-7B-Chat",

    "Qwen-14B": "Qwen/Qwen-14B",
    "Qwen-14B-Chat": "Qwen/Qwen-14B-Chat",
    "Qwen-14B-Chat-Int8": "Qwen/Qwen-14B-Chat-Int8",
    "Qwen-14B-Chat-Int4": "Qwen/Qwen-14B-Chat-Int4",

    "Qwen-72B": "Qwen/Qwen-72B",
    "Qwen-72B-Chat": "Qwen/Qwen-72B-Chat",
    "Qwen-72B-Chat-Int8": "Qwen/Qwen-72B-Chat-Int8",
    "Qwen-72B-Chat-Int4": "Qwen/Qwen-72B-Chat-Int4",

    "baichuan2-13b": "baichuan-inc/Baichuan2-13B-Chat",
    "baichuan2-7b": "baichuan-inc/Baichuan2-7B-Chat",

    "baichuan-7b": "baichuan-inc/Baichuan-7B",
    "baichuan-13b": "baichuan-inc/Baichuan-13B",
    "baichuan-13b-chat": "baichuan-inc/Baichuan-13B-Chat",

    "aquila-7b": "BAAI/Aquila-7B",
    "aquilachat-7b": "BAAI/AquilaChat-7B",

    "internlm-7b": "internlm/internlm-7b",
    "internlm-chat-7b": "internlm/internlm-chat-7b",

    "falcon-7b": "tiiuae/falcon-7b",
    "falcon-40b": "tiiuae/falcon-40b",
    "falcon-rw-7b": "tiiuae/falcon-rw-7b",

    "gpt2": "gpt2",
    "gpt2-xl": "gpt2-xl",

    "gpt-j-6b": "EleutherAI/gpt-j-6b",
    "gpt4all-j": "nomic-ai/gpt4all-j",
    "gpt-neox-20b": "EleutherAI/gpt-neox-20b",
    "pythia-12b": "EleutherAI/pythia-12b",
    "oasst-sft-4-pythia-12b-epoch-3.5": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "dolly-v2-12b": "databricks/dolly-v2-12b",
    "stablelm-tuned-alpha-7b": "stabilityai/stablelm-tuned-alpha-7b",

    "Llama-2-13b-hf": "meta-llama/Llama-2-13b-hf",
    "Llama-2-70b-hf": "meta-llama/Llama-2-70b-hf",
    "open_llama_13b": "openlm-research/open_llama_13b",
    "vicuna-13b-v1.3": "lmsys/vicuna-13b-v1.3",
    "koala": "young-geng/koala",

    "mpt-7b": "mosaicml/mpt-7b",
    "mpt-7b-storywriter": "mosaicml/mpt-7b-storywriter",
    "mpt-30b": "mosaicml/mpt-30b",
    "opt-66b": "facebook/opt-66b",
    "opt-iml-max-30b": "facebook/opt-iml-max-30b",

    "agentlm-7b": "THUDM/agentlm-7b",
    "agentlm-13b": "THUDM/agentlm-13b",
    "agentlm-70b": "THUDM/agentlm-70b",

    "Yi-34B-Chat": "",



# Storage path for nltk models

NLTK_DATA_PATH = os.path.join(os.path.dirname(os.path.dirname(__file__)), "nltk_data")

VLLM_MODEL_DICT = {
"aquila-7b": "BAAI/Aquila-7B",
"aquilachat-7b": "BAAI/AquilaChat-7B",

"baichuan-7b": "baichuan-inc/Baichuan-7B",
"baichuan-13b": "baichuan-inc/Baichuan-13B",
"baichuan-13b-chat": "baichuan-inc/Baichuan-13B-Chat",

"chatglm2-6b": "THUDM/chatglm2-6b",
"chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",
# "chatglm3-6b": "THUDM/chatglm3-6b",
# "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",

"BlueLM-7B-Chat": "vivo-ai/BlueLM-7B-Chat",
"BlueLM-7B-Chat-32k": "vivo-ai/BlueLM-7B-Chat-32k",

# Note: the tokenizer and model of the bloom series are separate, so although vllm supports them, they are incompatible with the fschat framework
# "bloom": "bigscience/bloom",
# "bloomz": "bigscience/bloomz",
# "bloomz-560m": "bigscience/bloomz-560m",
# "bloomz-7b1": "bigscience/bloomz-7b1",
# "bloomz-1b7": "bigscience/bloomz-1b7",

"internlm-7b": "internlm/internlm-7b",
"internlm-chat-7b": "internlm/internlm-chat-7b",
"falcon-7b": "tiiuae/falcon-7b",
"falcon-40b": "tiiuae/falcon-40b",
"falcon-rw-7b": "tiiuae/falcon-rw-7b",
"gpt2": "gpt2",
"gpt2-xl": "gpt2-xl",
"gpt-j-6b": "EleutherAI/gpt-j-6b",
"gpt4all-j": "nomic-ai/gpt4all-j",
"gpt-neox-20b": "EleutherAI/gpt-neox-20b",
"pythia-12b": "EleutherAI/pythia-12b",
"oasst-sft-4-pythia-12b-epoch-3.5": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
"dolly-v2-12b": "databricks/dolly-v2-12b",
"stablelm-tuned-alpha-7b": "stabilityai/stablelm-tuned-alpha-7b",
"Llama-2-13b-hf": "meta-llama/Llama-2-13b-hf",
"Llama-2-70b-hf": "meta-llama/Llama-2-70b-hf",
"open_llama_13b": "openlm-research/open_llama_13b",
"vicuna-13b-v1.3": "lmsys/vicuna-13b-v1.3",
"koala": "young-geng/koala",
"mpt-7b": "mosaicml/mpt-7b",
"mpt-7b-storywriter": "mosaicml/mpt-7b-storywriter",
"mpt-30b": "mosaicml/mpt-30b",
"opt-66b": "facebook/opt-66b",
"opt-iml-max-30b": "facebook/opt-iml-max-30b",

"Qwen-1_8B": "Qwen/Qwen-1_8B",
# "Qwen-1_8B-Chat": "Qwen/Qwen-1_8B-Chat",
"Qwen-1_8B-Chat-Int8": "Qwen/Qwen-1_8B-Chat-Int8",
"Qwen-1_8B-Chat-Int4": "Qwen/Qwen-1_8B-Chat-Int4",

"Qwen-7B": "Qwen/Qwen-7B",
"Qwen-7B-Chat": "Qwen/Qwen-7B-Chat",

"Qwen-14B": "Qwen/Qwen-14B",
"Qwen-14B-Chat": "Qwen/Qwen-14B-Chat",
"Qwen-14B-Chat-Int8": "Qwen/Qwen-14B-Chat-Int8",
"Qwen-14B-Chat-Int4": "Qwen/Qwen-14B-Chat-Int4",

"Qwen-72B": "Qwen/Qwen-72B",
"Qwen-72B-Chat": "Qwen/Qwen-72B-Chat",
"Qwen-72B-Chat-Int8": "Qwen/Qwen-72B-Chat-Int8",
"Qwen-72B-Chat-Int4": "Qwen/Qwen-72B-Chat-Int4",

"agentlm-7b": "THUDM/agentlm-7b",
"agentlm-13b": "THUDM/agentlm-13b",
"agentlm-70b": "THUDM/agentlm-70b",




SUPPORT_AGENT_MODEL = [ "azure-api", "openai-api", "qwen-api", "Qwen", "chatglm3", "xinghuo-api", ]

Which parts of the configuration file above do I need to change?

dosubot[bot] commented 6 months ago

Hey there, @302658980! Long time no chat. Hope you're doing awesomely! 🚀

To replace or speed up your current "bge-large-zh" Embedding model located at "/root/model/bge-large-zh" in the MODEL_PATH dictionary, you can either update the path to a new model or choose a different model that might be faster or more efficient. Here's how you can do it based on the provided configuration file:

  1. Replace with a New Model:

    • If you have a new model that you believe is faster or more suitable, you can simply change the path of the "bge-large-zh" key in the MODEL_PATH dictionary to point to the new model's location. For example, if the new model is located at /new/model/path, update the dictionary as follows:
      "bge-large-zh": "/new/model/path",
    • Ensure the new model is compatible with the rest of your setup.
  2. Choose a Different Pre-configured Model:

    • If you want to use a different pre-configured model that is expected to be faster, you can change the EMBEDDING_MODEL variable to match the key of the desired model in the MODEL_PATH dictionary. For example, if you want to switch to "bge-base-zh-v1.5" which might be lighter and faster, update the EMBEDDING_MODEL as follows:
      EMBEDDING_MODEL = "bge-base-zh-v1.5"
    • This change assumes that "bge-base-zh-v1.5" is a suitable replacement for your needs. You might need to consider the trade-offs in model performance and speed.
  3. Configuration Adjustments:

    • MODEL_ROOT_PATH: If your new model is located in a directory that is a child of a common root directory, you can set MODEL_ROOT_PATH to this common root directory. The program will automatically detect and load the model if its directory name matches the EMBEDDING_MODEL or the keys/values in MODEL_PATH.
    • EMBEDDING_DEVICE: Ensure this is set to the optimal device for running your model. For example, if you have a GPU available, setting it to "cuda" might speed up the model's performance.
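
As a rough illustration of options 2 and 3, the sketch below switches `EMBEDDING_MODEL` to a lighter key and then resolves where that model would be loaded from, following the lookup order described in the config comments (a matching directory under the model root, otherwise the configured value). The function `resolve_embedding_path` and the trimmed-down `MODEL_PATH` are assumptions made for this example; this is not the project's actual loader.

```python
from pathlib import Path

# Hypothetical, trimmed-down stand-ins for the values in the pasted config.
MODEL_PATH = {
    "embed_model": {
        "bge-large-zh": "/root/model/bge-large-zh",
        "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5",
    }
}
EMBEDDING_MODEL = "bge-base-zh-v1.5"  # option 2: pick a lighter pre-configured key


def resolve_embedding_path(model_root: str, name: str, model_path: dict) -> str:
    """Return a matching local directory if one exists, else the configured value.

    Illustrative only: it mirrors the lookup order described in the config
    comments and is not the project's real implementation.
    """
    value = model_path["embed_model"][name]
    # An existing directory given directly in MODEL_PATH is used as-is.
    if Path(value).is_dir():
        return value
    if model_root:
        root = Path(model_root)
        # Try the key, the full value, and the value's last path component.
        for candidate in (name, value, value.split("/")[-1]):
            path = root / candidate
            if path.is_dir():
                return str(path)
    # Nothing local matched: pass the value (e.g. a Hugging Face repo id) on.
    return value


if __name__ == "__main__":
    print(resolve_embedding_path("/root/model", EMBEDDING_MODEL, MODEL_PATH))
```

With this layout, dropping a directory named `bge-base-zh-v1.5` under the model root would be enough for the local copy to be picked up, without editing the dictionary entry.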

Remember, when making changes, ensure that the new model is compatible with your application's requirements and that you have tested the model's performance to ensure it meets your needs.

The compatible models that can be used as a replacement for the "bge-large-zh" Embedding model in the Langchain-Chatchat codebase, as per the provided context, include:

  1. "bge-small-zh": "BAAI/bge-small-zh"
  2. "bge-base-zh": "BAAI/bge-base-zh"
  3. "bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct"
  4. "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5"
  5. "bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5"
  6. "bge-m3": "BAAI/bge-m3"

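All of these except "bge-m3" are listed under the "embed_model" section of the configuration file pasted above, indicating they are intended for embedding purposes similar to "bge-large-zh". "bge-m3" does not appear in that file, so it would presumably need its own entry in MODEL_PATH first.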

Unfortunately, I wasn't able to find specific performance differences between the "bge-large-zh" Embedding model and the other compatible models listed in the Langchain-Chatchat codebase within the repository. You might need to conduct some performance tests to determine which model best suits your needs.
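
One way to run such a comparison is a small timing harness like the sketch below. It is only a starting point: `benchmark_embedder` and `fake_embed` are names invented for this example, and the placeholder embedder exists solely so the code runs without any model installed.

```python
import time
from typing import Callable, List, Sequence


def benchmark_embedder(
    embed: Callable[[Sequence[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 32,
) -> dict:
    """Time `embed` over `texts` in batches and report throughput."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = time.perf_counter()
    count = 0
    for i in range(0, len(texts), batch_size):
        vectors = embed(texts[i : i + batch_size])
        count += len(vectors)
    elapsed = time.perf_counter() - start
    return {
        "texts": count,
        "seconds": elapsed,
        "texts_per_second": count / elapsed if elapsed > 0 else float("inf"),
    }


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    """Placeholder embedder: one 1-dimensional vector per input text."""
    return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    sample = ["sample text"] * 100
    print(benchmark_embedder(fake_embed, sample, batch_size=16))
```

To compare real models, the `encode` method of a loaded `SentenceTransformer` could be passed in place of `fake_embed`, assuming the sentence-transformers package is installed and the same sample of documents is used for every model.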

For more details, you can refer to the model configuration example in the repository.


#### About Dosu This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
302658980 commented 6 months ago





I now have a faster embedding; the link is . How should I modify the config?

dosubot[bot] commented 6 months ago

To replace the current "bge-large-zh" Embedding model located at "/root/model/bge-large-zh" in the MODEL_PATH dictionary with a new model available at "", you need to update the MODEL_PATH dictionary in the configs/ file. Here's how you can do it:

MODEL_PATH["bge-large-zh"] = ""

This line of code should be placed where the MODEL_PATH dictionary is initialized or updated in your application.

However, there are several performance considerations and compatibility issues you should be aware of:

  1. Network Latency: The new model is hosted online, which means every request to this model will involve network latency. This could significantly affect the performance if your application relies on real-time processing.

  2. Availability and Reliability: Depending on an external service introduces a point of failure. If the service hosting the new model goes down or experiences issues, your application will be directly affected.

  3. Data Privacy and Security: Sending data to an external service might raise privacy and security concerns, especially if the data is sensitive. Ensure the service complies with your data handling policies.

  4. Model Compatibility: Ensure the new model's input and output formats are compatible with your current system. Any difference in expected input format, output format, or even the model's performance characteristics might require adjustments in your code.

  5. Rate Limiting: External services often have rate limits. Ensure that your usage of the new model complies with any such limits to avoid service interruptions.

  6. Cost: Using an external service might incur costs based on the number of requests, data transfer, or other factors. It's important to understand and plan for these costs.

You can find more details about how the MODEL_PATH dictionary is used in the Langchain-Chatchat codebase in the configs/ file.

Please test the integration thoroughly in a development environment before deploying changes to production.
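
If a remote embedding service is used at all, points 1 and 5 above (latency and rate limits) are usually handled by sending texts in batches and retrying failed batches with a growing delay. The sketch below shows that pattern only; `call_remote` is a hypothetical callable standing in for whatever client talks to the service, and nothing in this thread establishes that Langchain-Chatchat accepts a URL as a MODEL_PATH value.

```python
import time
from typing import Callable, List, Sequence


def embed_with_retry(
    call_remote: Callable[[Sequence[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 16,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[float]]:
    """Embed `texts` in batches, retrying each failed batch with backoff."""
    vectors: List[List[float]] = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i : i + batch_size]
        for attempt in range(1, max_attempts + 1):
            try:
                vectors.extend(call_remote(batch))
                break
            except Exception:
                # Broad catch keeps the sketch client-agnostic; narrow it
                # to the client's own error types in real code.
                if attempt == max_attempts:
                    raise
                # Delay doubles per retry: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
    return vectors
```

The `sleep` parameter is injectable so the retry behaviour can be exercised in tests without actually waiting.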


302658980 commented 6 months ago






Adding MODEL_PATH["bge-large-zh"] = " " to model_config doesn't seem quite right.

github-actions[bot] commented 5 months ago

This issue has been marked as stale because it has had no activity for more than 30 days.