xusenlinzy / api-for-open-llm

OpenAI-style API for open large language models, letting you use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend API for open-source large language models.
Apache License 2.0

Help: how do I run the non-chat baichuan-1 7B? #166

Closed. lehug closed this issue 10 months ago.

lehug commented 10 months ago

The following items must be checked before submission

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

Following the instructions, my env is configured as follows:

# model related
MODEL_NAME=baichuan-7b
MODEL_PATH=/home/server/LLM/models/HuatuoGPT-7B
ADAPTER_MODEL_PATH=/home/server/LLM/models/firefly-baichuan-7b-qlora-sft

HuatuoGPT-7B was downloaded from the https://github.com/FreedomIntelligence/HuatuoGPT project. I noticed it is based on baichuan, not baichuan2, so I understand the configuration above to be correct.

But when I run it, the server fails to start, with the following error:

  File "/home/server/miniconda3/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/server/miniconda3/lib/python3.11/site-packages/starlette/routing.py", line 69, in app
    await response(scope, receive, send)
  File "/home/server/miniconda3/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/server/miniconda3/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/server/miniconda3/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
    await func()
  File "/home/server/miniconda3/lib/python3.11/site-packages/starlette/responses.py", line 262, in stream_response
    async for chunk in self.body_iterator:
  File "/home/server/LLM/codes/api-for-open-llm-master/api/routes/chat.py", line 131, in chat_completion_stream_generator
    for content in GENERATE_MDDEL.generate_stream_gate(gen_params):
  File "/home/server/LLM/codes/api-for-open-llm-master/api/generation/core.py", line 94, in generate_stream_gate
    yield from self.generate_stream_gate_v1(params)
  File "/home/server/LLM/codes/api-for-open-llm-master/api/generation/core.py", line 98, in generate_stream_gate_v1
    params["prompt"] = self.generate_prompt(params["prompt"])
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/server/LLM/codes/api-for-open-llm-master/api/generation/core.py", line 88, in generate_prompt
    return self.prompt_adapter.apply_chat_template(messages) if self.construct_prompt else messages
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/server/LLM/codes/api-for-open-llm-master/api/apapter/template.py", line 53, in apply_chat_template
    compiled_template = _compile_jinja_template(self.template)
                                                ^^^^^^^^^^^^^
  File "/home/server/LLM/codes/api-for-open-llm-master/api/apapter/template.py", line 63, in template
    raise NotImplementedError
NotImplementedError

I tried adding baichuan7b at line 537 of template.py, but that seems to be intended for plain inference and normal conversation doesn't work: during a chat the model keeps generating output endlessly.
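For reference, the traceback points at the template mechanism in api/apapter/template.py: apply_chat_template compiles a Jinja2 string returned by each template's template property, and the base class raises NotImplementedError when no template is defined. A new entry therefore roughly follows the pattern below (a minimal sketch; the class name, base class, and prompt format are illustrative assumptions, not the project's exact code):

# Hypothetical template subclass, following the pattern implied by the
# traceback (apply_chat_template -> _compile_jinja_template(self.template)).
class Baichuan7bTemplate(BaseTemplate):
    name = "baichuan7b"

    @property
    def template(self) -> str:
        # Jinja2 template applied to the OpenAI-style message list; the
        # right prompt format depends on how the model was fine-tuned.
        return (
            "{% for message in messages %}"
            "{% if message['role'] == 'user' %}"
            "{{ message['content'] }}"
            "{% endif %}"
            "{% endfor %}"
        )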

How should I go about running the non-chat Baichuan? Also, shouldn't the examples add a few more steps for the Baichuan launch options? Thanks.

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

# Please paste the run log here
xusenlinzy commented 10 months ago

By "non-chat", do you mean that your fine-tuned model is no longer capable of dialogue, or that it uses a specific chat template? If you are using the firefly template, you can try setting the environment variable PROMPT_NAME=firefly.
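Concretely, with the values from the .env shown above, that means adding one line:

# model related
MODEL_NAME=baichuan-7b
MODEL_PATH=/home/server/LLM/models/HuatuoGPT-7B
ADAPTER_MODEL_PATH=/home/server/LLM/models/firefly-baichuan-7b-qlora-sft
PROMPT_NAME=firefly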

lehug commented 10 months ago

By "non-chat", do you mean that your fine-tuned model is no longer capable of dialogue, or that it uses a specific chat template? If you are using the firefly template, you can try setting the environment variable PROMPT_NAME=firefly.

I don't know much about this model; I'm just using the publicly released, medically fine-tuned model mentioned above. I just tried PROMPT_NAME=firefly: the NotImplementedError no longer appears, but the chat replies are nonsense. For example:

{'model': '/home/server/LLM/models/carellm', 'frequency_penalty': 0.0, 'function_call': None, 'functions': None, 'logit_bias': None, 'max_tokens': 1024, 'n': 1, 'presence_penalty': 0.0, 'response_format': None, 'seed': None, 'stop': [], 'temperature': 0.3, 'tool_choice': None, 'tools': None, 'top_p': 1.0, 'user': None, 'stream': False, 'prompt': [{'content': '你好', 'role': 'user'}, {'content': '如何评价电影《肖申克的救赎》(The Shawshank Redemption)?\n《肖申克的救赎》是一部经典的电影,它讲述了一位银行家安迪被错误判入肖申克监狱的故事。在监狱里,安迪遇到了一位囚犯艾利斯,两人成为了朋友。经过多年的努力,安迪终于成功越狱,并最终获得了自由。\n这部电影以其深刻的主题、真挚的情感和出色的表演赢得了观众的uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu', 'role': 'assistant'}, {'content': '使用四到五个字直接返回这句话的简要主题,不要解释、不要标点、不要语气词、不要多余文本,如果没有主题,请直接返回“闲聊”', 'role': 'user'}], 'echo': False, 'stop_token_ids': []}

I sent 你好 ("hello") and it replied with a big pile of gibberish.

I followed the steps at https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#baichuan-7b exactly, and that page also appears to describe the non-chat model, so why doesn't the behavior match what's expected? Help appreciated.

xusenlinzy commented 10 months ago

If you are using the model from https://github.com/FreedomIntelligence/HuatuoGPT , pull the code that was just updated and try the following configuration:

MODEL_NAME=baichuan-7b
MODEL_PATH=/home/server/LLM/models/HuatuoGPT-7B
PROMPT_NAME=huatuo

The ADAPTER_MODEL_PATH parameter is not needed here.
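Once the server is up, the OpenAI-style endpoint can be checked with the openai Python client. A minimal sketch, assuming the server listens at http://localhost:8000/v1 and the pre-1.0 openai package; adjust host, port, and model name to your deployment:

import openai

# Point the client at the local api-for-open-llm server instead of OpenAI.
openai.api_base = "http://localhost:8000/v1"  # assumed address; match your PORT setting
openai.api_key = "none"  # placeholder; whether the key is checked depends on your config

completion = openai.ChatCompletion.create(
    model="baichuan-7b",
    messages=[{"role": "user", "content": "你好"}],
)
print(completion.choices[0].message.content)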

lehug commented 10 months ago

Thanks, tried it and it works now. One more question: what about models under https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese ? Some are based on "Huozi" (活字), and Huozi is based on BLOOM. How would a project like that be run here? The special part is that it only releases LoRA weights; do they need to be merged first before running? And when running, would it also be based on bloom, with a separate template written for it?

xusenlinzy commented 10 months ago

Merging isn't necessary: set ADAPTER_MODEL_PATH to the LoRA weight path, set MODEL_NAME=bloom, and set PROMPT_NAME to the template you add for that particular model.
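If you would rather merge the LoRA weights offline first, a minimal sketch with transformers and peft (all paths are placeholders); afterwards MODEL_PATH can point at the merged checkpoint and ADAPTER_MODEL_PATH is no longer needed:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/path/to/bloom-base"     # placeholder: the Huozi/BLOOM base weights
lora_path = "/path/to/lora-weights"   # placeholder: the released LoRA checkpoint
out_path = "/path/to/merged-model"    # placeholder: where to save the result

# Load the base model, attach the LoRA adapter, fold the adapter weights
# into the base weights, and save a standalone checkpoint.
base = AutoModelForCausalLM.from_pretrained(base_path, trust_remote_code=True)
model = PeftModel.from_pretrained(base, lora_path)
merged = model.merge_and_unload()
merged.save_pretrained(out_path)

# Save the tokenizer alongside the merged weights.
AutoTokenizer.from_pretrained(base_path, trust_remote_code=True).save_pretrained(out_path)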