chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain and LLMs such as ChatGLM, Qwen, and Llama.
Apache License 2.0

File vectorization is a bit slow. How can I replace my Embedding model or speed it up? #3325

Closed: 302658980 closed this issue 4 months ago

302658980 commented 6 months ago

import os

# A single absolute path can be set here to keep all Embedding and LLM models in one place.

# Each model can be its own directory, or a second-level subdirectory under some directory.

# If a model's directory name matches a key or value in MODEL_PATH, the program detects and
# loads it automatically, without any need to edit the paths in MODEL_PATH.

MODEL_ROOT_PATH = ""

# Name of the Embedding model to use

EMBEDDING_MODEL = "bge-large-zh"

# Device on which the Embedding model runs. "auto" detects automatically; it can also be set to "cuda", "mps", or "cpu".

EMBEDDING_DEVICE = "auto"

# Configure these if you need to add custom keywords to EMBEDDING_MODEL.

EMBEDDING_KEYWORD_FILE = "keywords.txt"
EMBEDDING_MODEL_OUTPUT_PATH = "output"

# Names of the LLMs to run; local and online models can both be listed. Every local model
# in the list is loaded when the project starts.

# The first model in the list is the default model for the API and the WebUI.

# Here two mainstream offline models are used, with chatglm3-6b as the default loaded model.

# If you are short on GPU memory, you can use Qwen-1_8B-Chat, which needs only 3.8 GB of VRAM in FP16.

LLM_MODELS = ["qwen-14b", "yi-34b"] # "Qwen-1_8B-Chat",

# Name of the AgentLM model (optional; if set, it fixes the model used by the Chain after entering the Agent, otherwise LLM_MODELS[0] is used)

Agent_MODEL = None

# Device on which the LLM runs. "auto" detects automatically; it can also be set to "cuda", "mps", or "cpu".

LLM_DEVICE = "auto"

# Number of history turns kept in a conversation

HISTORY_LEN = 3

# Maximum length supported by the model. If left unset, the model's default maximum length is used; otherwise the user-specified value applies.

MAX_TOKENS = None

# General LLM conversation parameters

TEMPERATURE = 0.7

TOP_P = 0.95 # this parameter is not yet supported by ChatOpenAI

# Online models. Set a different port in server_config for each online API.
ONLINE_LLM_MODEL = {

"qwen-14b": {
    "model_name": "gpt-3.5-turbo",
    "api_base_url": "http://region-9.seetacloud.com:25540/v1",
    "api_key": "none",
    "openai_proxy": "",
},
"yi-34b": {
    "model_name": "gpt-3.5-turbo",
    "api_base_url": "http://region-9.seetacloud.com:19656/v1",
    "api_key": "none",
    "openai_proxy": "",
},

# For registration and API key, see http://open.bigmodel.cn
"zhipu-api": {
    "api_key": "",
    "version": "chatglm_turbo",  # options include "chatglm_turbo"
    "provider": "ChatGLMWorker",
},

# For registration and API key, see https://api.minimax.chat/
"minimax-api": {
    "group_id": "",
    "api_key": "",
    "is_pro": False,
    "provider": "MiniMaxWorker",
},

# For registration and API key, see https://xinghuo.xfyun.cn/
"xinghuo-api": {
    "APPID": "",
    "APISecret": "",
    "api_key": "",
    "version": "v1.5",  # version of the iFLYTEK Spark model you use; options include "v3.0", "v1.5", "v2.0"
    "provider": "XingHuoWorker",
},

# Baidu Qianfan API; see https://cloud.baidu.com/doc/WENXINWORKSHOP/s/4lilb2lpf for how to apply
"qianfan-api": {
    "version": "ERNIE-Bot",  # case-sensitive; currently "ERNIE-Bot" or "ERNIE-Bot-turbo" are supported, see the official docs for more
    "version_url": "",  # alternatively, leave version empty and fill in the API URL of the model release you applied for on Qianfan
    "api_key": "",
    "secret_key": "",
    "provider": "QianFanWorker",
},

# Volcengine Ark API; docs at https://www.volcengine.com/docs/82379
"fangzhou-api": {
    "version": "chatglm-6b-model",  # currently supports "chatglm-6b-model"; see the Ark section of the supported-model list in the docs for more
    "version_url": "",  # alternatively, leave version empty and fill in the API URL of the model release you applied for on Ark
    "api_key": "",
    "secret_key": "",
    "provider": "FangZhouWorker",
},

# Alibaba Cloud Tongyi Qianwen API; docs at https://help.aliyun.com/zh/dashscope/developer-reference/api-details
"qwen-api": {
    "version": "qwen-turbo",  # options include "qwen-turbo", "qwen-plus"
    "api_key": "",  # create one on the DashScope API-KEY management page in the Alibaba Cloud console
    "provider": "QwenWorker",
},

# Baichuan API; see https://www.baichuan-ai.com/home#api-enter for how to apply
"baichuan-api": {
    "version": "Baichuan2-53B",  # currently supports "Baichuan2-53B"; see the official docs
    "api_key": "",
    "secret_key": "",
    "provider": "BaiChuanWorker",
},

# Azure API
"azure-api": {
    "deployment_name": "",  # name of the deployment
    "resource_name": "",  # fill in only the resource_name part of https://{resource_name}.openai.azure.com/openai/, leave the rest out
    "api_version": "",  # API version, not the model version
    "api_key": "",
    "provider": "AzureWorker",
},

# Kunlun Tiangong API; https://model-platform.tiangong.cn/
"tiangong-api": {
    "version": "SkyChat-MegaVerse",
    "api_key": "",
    "secret_key": "",
    "provider": "TianGongWorker",
},

}

# Edit the values in the dictionary below to point at your local embedding models.
# Three ways of setting them are supported:
#
# 1. Change the value to the model's absolute path.
#
# 2. Leave the value here unchanged (using text2vec as an example):
#
#    2.1 If any of the following subdirectories exists under {MODEL_ROOT_PATH},
#        it is detected and loaded:
#        - text2vec
#        - GanymedeNil/text2vec-large-chinese
#        - text2vec-large-chinese
#
#    2.2 If none of the local paths above exists, the Hugging Face model is used.
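# Example (assumed layout, for illustration): with MODEL_ROOT_PATH = "/root/model" and a
# directory /root/model/bge-large-zh present on disk, the "bge-large-zh" entry below would
# be detected and loaded automatically even if its value were left at a Hugging Face repo
# id such as "BAAI/bge-large-zh".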

MODEL_PATH = { "embed_model": { "ernie-tiny": "nghuyong/ernie-3.0-nano-zh", "ernie-base": "nghuyong/ernie-3.0-base-zh", "text2vec-base": "shibing624/text2vec-base-chinese", "text2vec": "GanymedeNil/text2vec-large-chinese", "text2vec-paraphrase": "shibing624/text2vec-base-chinese-paraphrase", "text2vec-sentence": "shibing624/text2vec-base-chinese-sentence", "text2vec-multilingual": "shibing624/text2vec-base-multilingual", "text2vec-bge-large-chinese": "shibing624/text2vec-bge-large-chinese", "m3e-small": "moka-ai/m3e-small", "m3e-base": "moka-ai/m3e-base", "m3e-large": "moka-ai/m3e-large", "bge-small-zh": "BAAI/bge-small-zh", "bge-base-zh": "BAAI/bge-base-zh", "bge-large-zh": "/root/model/bge-large-zh", "bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct", "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5", "bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5", "piccolo-base-zh": "sensenova/piccolo-base-zh", "piccolo-large-zh": "sensenova/piccolo-large-zh", "nlp_gte_sentence-embedding_chinese-large": "damo/nlp_gte_sentence-embedding_chinese-large", "text-embedding-ada-002": "your OPENAI_API_KEY", },

"llm_model": {
    # Some of the models below have not been fully tested; support is inferred from the fastchat and vllm model lists.
    "chatglm2-6b": "/root/model/chatglm2-6b",
    "chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",

    "chatglm3-6b": "/root/model/chatglm3-6b",
    "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",
    "chatglm3-6b-base": "THUDM/chatglm3-6b-base",

    "Qwen-1_8B": "Qwen/Qwen-1_8B",
    # "Qwen-1_8B-Chat": "/root/model/Qwen-1_8B-Chat",
    "Qwen-1_8B-Chat-Int8": "Qwen/Qwen-1_8B-Chat-Int8",
    "Qwen-1_8B-Chat-Int4": "Qwen/Qwen-1_8B-Chat-Int4",

    "Qwen-7B": "Qwen/Qwen-7B",
    "Qwen-7B-Chat": "Qwen/Qwen-7B-Chat",

    "Qwen-14B": "Qwen/Qwen-14B",
    "Qwen-14B-Chat": "Qwen/Qwen-14B-Chat",
    "Qwen-14B-Chat-Int8": "Qwen/Qwen-14B-Chat-Int8",
    "Qwen-14B-Chat-Int4": "Qwen/Qwen-14B-Chat-Int4",

    "Qwen-72B": "Qwen/Qwen-72B",
    "Qwen-72B-Chat": "Qwen/Qwen-72B-Chat",
    "Qwen-72B-Chat-Int8": "Qwen/Qwen-72B-Chat-Int8",
    "Qwen-72B-Chat-Int4": "Qwen/Qwen-72B-Chat-Int4",

    "baichuan2-13b": "baichuan-inc/Baichuan2-13B-Chat",
    "baichuan2-7b": "baichuan-inc/Baichuan2-7B-Chat",

    "baichuan-7b": "baichuan-inc/Baichuan-7B",
    "baichuan-13b": "baichuan-inc/Baichuan-13B",
    "baichuan-13b-chat": "baichuan-inc/Baichuan-13B-Chat",

    "aquila-7b": "BAAI/Aquila-7B",
    "aquilachat-7b": "BAAI/AquilaChat-7B",

    "internlm-7b": "internlm/internlm-7b",
    "internlm-chat-7b": "internlm/internlm-chat-7b",

    "falcon-7b": "tiiuae/falcon-7b",
    "falcon-40b": "tiiuae/falcon-40b",
    "falcon-rw-7b": "tiiuae/falcon-rw-7b",

    "gpt2": "gpt2",
    "gpt2-xl": "gpt2-xl",

    "gpt-j-6b": "EleutherAI/gpt-j-6b",
    "gpt4all-j": "nomic-ai/gpt4all-j",
    "gpt-neox-20b": "EleutherAI/gpt-neox-20b",
    "pythia-12b": "EleutherAI/pythia-12b",
    "oasst-sft-4-pythia-12b-epoch-3.5": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    "dolly-v2-12b": "databricks/dolly-v2-12b",
    "stablelm-tuned-alpha-7b": "stabilityai/stablelm-tuned-alpha-7b",

    "Llama-2-13b-hf": "meta-llama/Llama-2-13b-hf",
    "Llama-2-70b-hf": "meta-llama/Llama-2-70b-hf",
    "open_llama_13b": "openlm-research/open_llama_13b",
    "vicuna-13b-v1.3": "lmsys/vicuna-13b-v1.3",
    "koala": "young-geng/koala",

    "mpt-7b": "mosaicml/mpt-7b",
    "mpt-7b-storywriter": "mosaicml/mpt-7b-storywriter",
    "mpt-30b": "mosaicml/mpt-30b",
    "opt-66b": "facebook/opt-66b",
    "opt-iml-max-30b": "facebook/opt-iml-max-30b",

    "agentlm-7b": "THUDM/agentlm-7b",
    "agentlm-13b": "THUDM/agentlm-13b",
    "agentlm-70b": "THUDM/agentlm-70b",

    "Yi-34B-Chat": "https://huggingface.co/01-ai/Yi-34B-Chat",
},

}

# You normally do not need to change anything below this point.

# Path where nltk models are stored

NLTK_DATA_PATH = os.path.join(os.path.dirname(os.path.dirname(__file__)), "nltk_data")

VLLM_MODEL_DICT = {
"aquila-7b": "BAAI/Aquila-7B",
"aquilachat-7b": "BAAI/AquilaChat-7B",

"baichuan-7b": "baichuan-inc/Baichuan-7B",
"baichuan-13b": "baichuan-inc/Baichuan-13B",
"baichuan-13b-chat": "baichuan-inc/Baichuan-13B-Chat",

"chatglm2-6b": "THUDM/chatglm2-6b",
"chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",
# "chatglm3-6b": "THUDM/chatglm3-6b",
# "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",

"BlueLM-7B-Chat": "vivo-ai/BlueLM-7B-Chat",
"BlueLM-7B-Chat-32k": "vivo-ai/BlueLM-7B-Chat-32k",

# Note: the tokenizer and model of the bloom series are separate, so although vllm supports them, they are not compatible with the fschat framework.
# "bloom": "bigscience/bloom",
# "bloomz": "bigscience/bloomz",
# "bloomz-560m": "bigscience/bloomz-560m",
# "bloomz-7b1": "bigscience/bloomz-7b1",
# "bloomz-1b7": "bigscience/bloomz-1b7",

"internlm-7b": "internlm/internlm-7b",
"internlm-chat-7b": "internlm/internlm-chat-7b",
"falcon-7b": "tiiuae/falcon-7b",
"falcon-40b": "tiiuae/falcon-40b",
"falcon-rw-7b": "tiiuae/falcon-rw-7b",
"gpt2": "gpt2",
"gpt2-xl": "gpt2-xl",
"gpt-j-6b": "EleutherAI/gpt-j-6b",
"gpt4all-j": "nomic-ai/gpt4all-j",
"gpt-neox-20b": "EleutherAI/gpt-neox-20b",
"pythia-12b": "EleutherAI/pythia-12b",
"oasst-sft-4-pythia-12b-epoch-3.5": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
"dolly-v2-12b": "databricks/dolly-v2-12b",
"stablelm-tuned-alpha-7b": "stabilityai/stablelm-tuned-alpha-7b",
"Llama-2-13b-hf": "meta-llama/Llama-2-13b-hf",
"Llama-2-70b-hf": "meta-llama/Llama-2-70b-hf",
"open_llama_13b": "openlm-research/open_llama_13b",
"vicuna-13b-v1.3": "lmsys/vicuna-13b-v1.3",
"koala": "young-geng/koala",
"mpt-7b": "mosaicml/mpt-7b",
"mpt-7b-storywriter": "mosaicml/mpt-7b-storywriter",
"mpt-30b": "mosaicml/mpt-30b",
"opt-66b": "facebook/opt-66b",
"opt-iml-max-30b": "facebook/opt-iml-max-30b",

"Qwen-1_8B": "Qwen/Qwen-1_8B",
# "Qwen-1_8B-Chat": "Qwen/Qwen-1_8B-Chat",
"Qwen-1_8B-Chat-Int8": "Qwen/Qwen-1_8B-Chat-Int8",
"Qwen-1_8B-Chat-Int4": "Qwen/Qwen-1_8B-Chat-Int4",

"Qwen-7B": "Qwen/Qwen-7B",
"Qwen-7B-Chat": "Qwen/Qwen-7B-Chat",

"Qwen-14B": "Qwen/Qwen-14B",
"Qwen-14B-Chat": "Qwen/Qwen-14B-Chat",
"Qwen-14B-Chat-Int8": "Qwen/Qwen-14B-Chat-Int8",
"Qwen-14B-Chat-Int4": "Qwen/Qwen-14B-Chat-Int4",

"Qwen-72B": "Qwen/Qwen-72B",
"Qwen-72B-Chat": "Qwen/Qwen-72B-Chat",
"Qwen-72B-Chat-Int8": "Qwen/Qwen-72B-Chat-Int8",
"Qwen-72B-Chat-Int4": "Qwen/Qwen-72B-Chat-Int4",

"agentlm-7b": "THUDM/agentlm-7b",
"agentlm-13b": "THUDM/agentlm-13b",
"agentlm-70b": "THUDM/agentlm-70b",

}

# Models you consider capable of Agent use can be added here; once added, the WebUI
# will no longer show a warning for them.

# In our tests, only the following models natively support Agent use.

SUPPORT_AGENT_MODEL = [
    "azure-api",
    "openai-api",
    "qwen-api",
    "Qwen",
    "chatglm3",
    "xinghuo-api",
]

Which parts of the configuration file above do I need to modify?

dosubot[bot] commented 6 months ago

Hey there, @302658980! Long time no chat. Hope you're doing awesomely! 🚀

To replace or speed up your current "bge-large-zh" Embedding model located at "/root/model/bge-large-zh" in the MODEL_PATH dictionary, you can either update the path to a new model or choose a different model that might be faster or more efficient. Here's how you can do it based on the provided configuration file:

  1. Replace with a New Model:

    • If you have a new model that you believe is faster or more suitable, you can simply change the path of the "bge-large-zh" key in the MODEL_PATH dictionary to point to the new model's location. For example, if the new model is located at /new/model/path, update the dictionary as follows:
      "bge-large-zh": "/new/model/path",
    • Ensure the new model is compatible with the rest of your setup.
  2. Choose a Different Pre-configured Model:

    • If you want to use a different pre-configured model that is expected to be faster, you can change the EMBEDDING_MODEL variable to match the key of the desired model in the MODEL_PATH dictionary. For example, if you want to switch to "bge-base-zh-v1.5" which might be lighter and faster, update the EMBEDDING_MODEL as follows:
      EMBEDDING_MODEL = "bge-base-zh-v1.5"
    • This change assumes that "bge-base-zh-v1.5" is a suitable replacement for your needs. You might need to consider the trade-offs in model performance and speed.
  3. Configuration Adjustments:

    • MODEL_ROOT_PATH: If your new model is located in a directory that is a child of a common root directory, you can set MODEL_ROOT_PATH to this common root directory. The program will automatically detect and load the model if its directory name matches the EMBEDDING_MODEL or the keys/values in MODEL_PATH.
    • EMBEDDING_DEVICE: Ensure this is set to the optimal device for running your model. For example, if you have a GPU available, setting it to "cuda" might speed up the model's performance.

Remember, when making changes, ensure that the new model is compatible with your application's requirements and that you have tested the model's performance to ensure it meets your needs.
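Putting those options together, a minimal sketch of the relevant lines in configs/model_config.py could look like the following. The /root/model root is inferred from the configuration posted above, the /root/model/bge-base-zh-v1.5 path is an assumed location, and bge-base-zh-v1.5 itself is only an illustrative choice of lighter model, not a recommendation:

# configs/model_config.py -- sketch only; adjust names and paths to your setup.
MODEL_ROOT_PATH = "/root/model"        # option 3: shared root; matching directory names are auto-detected
EMBEDDING_MODEL = "bge-base-zh-v1.5"   # option 2: switch to a lighter embedding model
EMBEDDING_DEVICE = "cuda"              # option 3: pin the embedding model to the GPU

MODEL_PATH = {
    "embed_model": {
        # option 1: point the chosen key at an absolute path (assumed location),
        # or keep the Hugging Face repo id and let it be downloaded.
        "bge-base-zh-v1.5": "/root/model/bge-base-zh-v1.5",
        # ... other entries unchanged ...
    },
    # ...
}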

The compatible models that can be used as a replacement for the "bge-large-zh" Embedding model in the Langchain-Chatchat codebase, as per the provided context, include:

  1. "bge-small-zh": "BAAI/bge-small-zh"
  2. "bge-base-zh": "BAAI/bge-base-zh"
  3. "bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct"
  4. "bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5"
  5. "bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5"
  6. "bge-m3": "BAAI/bge-m3"

These models are listed under the "embed_model" section in the provided configuration file, indicating they are intended for embedding purposes similar to "bge-large-zh".

Unfortunately, I wasn't able to find specific performance differences between the "bge-large-zh" Embedding model and the other compatible models listed in the Langchain-Chatchat codebase within the repository. You might need to conduct some performance tests to determine which model best suits your needs.

For more details, you can refer to the model configuration example in the repository.

302658980 commented 6 months ago


Now I have a faster embedding service, available at http://region-9.seetacloud.com:21375/embed. How should I modify the configuration to use it?

dosubot[bot] commented 6 months ago

To replace the current "bge-large-zh" Embedding model located at "/root/model/bge-large-zh" in the MODEL_PATH dictionary with a new model available at "http://region-9.seetacloud.com:21375/embed", you need to update the MODEL_PATH dictionary in the configs/model_config.py.example file. Here's how you can do it:

MODEL_PATH["bge-large-zh"] = "http://region-9.seetacloud.com:21375/embed"

This line of code should be placed where the MODEL_PATH dictionary is initialized or updated in your application.

However, there are several performance considerations and compatibility issues you should be aware of:

  1. Network Latency: The new model is hosted online, which means every request to this model will involve network latency. This could significantly affect the performance if your application relies on real-time processing.

  2. Availability and Reliability: Depending on an external service introduces a point of failure. If the service hosting the new model goes down or experiences issues, your application will be directly affected.

  3. Data Privacy and Security: Sending data to an external service might raise privacy and security concerns, especially if the data is sensitive. Ensure the service complies with your data handling policies.

  4. Model Compatibility: Ensure the new model's input and output formats are compatible with your current system. Any difference in expected input format, output format, or even the model's performance characteristics might require adjustments in your code.

  5. Rate Limiting: External services often have rate limits. Ensure that your usage of the new model complies with any such limits to avoid service interruptions.

  6. Cost: Using an external service might incur costs based on the number of requests, data transfer, or other factors. It's important to understand and plan for these costs.

You can find more details about how the MODEL_PATH dictionary is used in the Langchain-Chatchat codebase in the configs/model_config.py.example file.

Please test the integration thoroughly in a development environment before deploying changes to production.
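For instance, a quick way to check what the remote /embed endpoint actually returns, before wiring it into any config, is a direct HTTP call. This is only a hedged sketch: the "input" and "model" payload fields below follow an OpenAI-style embeddings convention, which is an assumption about the service, so adjust them to whatever schema it really expects.

import requests

# Hypothetical smoke test for the remote embedding service mentioned above.
# The "input"/"model" payload fields are assumptions (OpenAI-style schema);
# replace them with the fields the service actually documents.
url = "http://region-9.seetacloud.com:21375/embed"
payload = {"input": ["测试文本"], "model": "bge-large-zh"}

resp = requests.post(url, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # inspect the shape of the returned vectors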

302658980 commented 6 months ago


Adding MODEL_PATH["bge-large-zh"] = "http://region-9.seetacloud.com:21375/embed" to model_config does not seem right.

github-actions[bot] commented 5 months ago

This issue has been marked as stale because it has had no activity for more than 30 days.