GolfHotelSierra commented 5 months ago

问题描述

需要部署internlm2-7b而内置模型中仅有intern2-chat，因此选择了自定义部署
在部署成功后（部署流程见下文），对模型进行问答测试，因为需要结合 LangChain 使用，所以使用了 langchain_community.llms.xinference 下的 Xinference

使用了如下的测试代码，


def test_xinference(server_url: str, model_uid: str):
llm = Xinference(server_url=server_url, model_uid=model_uid, temperature=0, stream=False, stop=['\n'])
for _ in range(5):
    response = llm.invoke(input="你好，你是谁？")
    print(response.strip())
    print('--------')
    response = llm.invoke(input="Hello, who're you?")
    print(response.strip())
    print("===============")
print("test pass!!!")

if name == 'main':

test_xinference

server_url = 'local_address'
model_uid = 'internlm2-7b'
test_xinference(server_url=server_url, model_uid=model_uid)


### 测试结果
1. 在多次测试中，模型在多个 `temperature` 设置下，都未能给出正常的回答
- temperature = 0

对中文给出了空回答

对英文给出的回答不太符合一般模型的介绍

I'm a new member here. :D

- temperature = 0.8

对中文给出了空回答

对英文给出的回答与问题相关度很低

I'm [b]usernameyourself[/b]. Pleased to meet ya!


### 预期的输出
1. 通过 `transformers` 库部署后的输出如下，

中文回答

你好，我是 InternLM (书生·浦语)，是上海人工智能实验室开发的一款语言模型。我可以理解并回答你的问题。

英文回答

Hi, I'm InternLM (书生·浦语), a conversational language model developed by Shanghai AI Laboratory (上海人工智能实验室). How can I assist you today?


### 部署环境

langchain 0.1.13 langchain-community 0.0.29 langchain-core 0.1.33 xinference 0.9.4


### 部署过程
1. 需要部署internlm2-7b而内置模型中仅有intern2-chat，因此选择了自定义部署
2. 使用 `.json` 文件进行部署，文件内容如下，模型已经下载到本地，
```json
{
  "version": 1,
  "context_length": 8192,
  "model_name": "internlm2-7b",
  "model_lang": [
    "en",
    "zh"
  ],
  "model_ability": [
    "generate"
  ],
  "model_family": "other",
  "model_specs": [
    {
      "model_format": "pytorch",
      "model_size_in_billions": 7,
      "quantizations": [
        "none"
      ],
      "model_id": "Shanghai_AI_Laboratory/internlm2-7b",
      "model_uri": "local_model_path"
    }
  ]
}

使用的注册命令如下，

xinference register --model-type LLM --file internlm2_7b.json --persist --endpoint local_address

使用以下的指令进行部署，部署成功，无报错，

xinference launch -n internlm2-7b -s 7 -f pytorch -q none -e local_address

疑问

请问这样的结果是否是 LangChain 中的 Xinference 有一些不同的参数设置，或是版本问题？或者可能在自定义部署中有一些参数设置不正确？

qinxuye commented 5 months ago

直接用 internlm2-chat 试下，应该就是你要的。

GolfHotelSierra commented 5 months ago

直接用 internlm2-chat 试下，应该就是你要的。

谢谢您的回答，internlm2-chat 确实也满足需求了不过我又测试了一些微调后的模型，同样参数设置下，xinference的推理结果也会出现比较明显的下降因为我也想用xinference部署一些微调后的模型然后使用langchain进行开发，所以可能还是需要解决这个问题，您知道这个问题可能是什么造成的吗？

qinxuye commented 5 months ago

model family 那里写 internlm2-chat 试下

GolfHotelSierra commented 5 months ago

model family 那里写 internlm2-chat 试下

我将 model_family 修改为了 "model_family": "internlm2-chat"，并且 model_ability 修改为了 "model_ability": ["generate", "chat"]，但是推理结果并没有发生变化，推理结果也还是会低于 transformers 库部署后推理的结果

buptzyf commented 3 months ago

我发现langchain的xinference就没有实现chat，都是generate，所以对话能力很弱

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.

xorbitsai / inference

成功自定义部署本地模型，但在使用LangChain中的Xinference工具时，模型无法给出正常的回答 #1211

问题描述

test_xinference

对中文给出了空回答

对英文给出的回答不太符合一般模型的介绍

对中文给出了空回答

对英文给出的回答与问题相关度很低

中文回答

英文回答

疑问