OpenAI-style API for open large language models — use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend API for open-source large language models.
The following items must be checked before submission
[X] Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
[X] I have read the project documentation and FAQ, and searched the existing issues / discussions without finding a similar problem or solution.
Type of problem
Other issues
Operating system
Linux
Detailed description of the problem
```python
# Paste the runtime code here (delete the code block if you don't have it)
# api/generation/qwen.py
def build_qwen_chat_input(
    tokenizer: PreTrainedTokenizer,
    messages: List[ChatMessage],
    context_len: int = 8192,
    max_new_tokens: int = 256,
    functions: List[dict] = None,
) -> List[int]:
    """ https://huggingface.co/Qwen/Qwen-7B-Chat/blob/main/qwen_generation_utils.py """
    query, history = process_qwen_messages(messages, functions)
    for q, r in history:
        messages.extend([ChatMessage(role=Role.USER, content=q), ChatMessage(role=Role.ASSISTANT, content=r)])
    messages.append(ChatMessage(role=Role.USER, content=query))

    for i, x in enumerate(messages):
        print('messages', i, x.role, x.content)

    max_input_tokens = context_len - max_new_tokens
    system, rounds = parse_messages(messages)
    system = "You are a helpful assistant." + system  # fix system prompt
```
Why, after `messages` has already been passed in, does the function still execute the `messages.extend(...)` / `messages.append(...)` rebuild shown above?
At this point `messages` already contains the user's question, but `query` is also the user's question. Won't this make the LLM's input end up looking like: system user_input user_input?
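The suspected duplication can be sketched with a minimal, self-contained example. `Msg` and `split_query_history` below are hypothetical stand-ins for the project's `ChatMessage` and `process_qwen_messages` (they are not the real implementations), assuming the splitter returns the last user message as `query` and earlier (user, assistant) pairs as `history`:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Msg:
    role: str
    content: str

def split_query_history(messages: List[Msg]) -> Tuple[str, List[Tuple[str, str]]]:
    # Hypothetical stand-in for process_qwen_messages: the last user message
    # becomes `query`, earlier (user, assistant) pairs become `history`.
    query = messages[-1].content
    history = [
        (messages[i].content, messages[i + 1].content)
        for i in range(0, len(messages) - 1, 2)
    ]
    return query, history

messages = [Msg("user", "hi"), Msg("assistant", "hello"), Msg("user", "what is 2+2?")]
query, history = split_query_history(messages)

# Same pattern as the pasted code: extending the ORIGINAL `messages` list,
# which already holds the conversation, instead of rebuilding a fresh one.
for q, r in history:
    messages.extend([Msg("user", q), Msg("assistant", r)])
messages.append(Msg("user", query))

contents = [m.content for m in messages]
print(contents)
# Every turn, including the final user question, now appears twice.
```

If this reading is right, resetting the list (e.g. `messages = []`) before the rebuild loop would avoid the doubled input, but that is an assumption about the intended behavior rather than a confirmed fix.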
Dependencies
No response
Runtime logs or screenshots