Qwen tool calls: replies are abnormally truncated

JinCheng666 commented 1 month ago

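Routine checks

[x] I have confirmed that there is currently no similar issue
[x] I have read the project README and the project documentation in full
[x] I am using my own key and have confirmed that it works
[x] I understand and am willing to follow up on this issue, help with testing, and provide feedback
[x] I understand and accept the above, and understand that the maintainers have limited time; issues that do not follow the rules may be ignored or closed directly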

Your version: v4.8.2

[ ] Public cloud version
[x] Self-hosted version, specific version number:

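Problem description, logs and screenshots

I am using the xinference (0.11.1) framework with vllm (0.4.1) for accelerated inference to serve a local Qwen1.5-72B-Chat-GPTQ-Int4 model, which is connected through oneapi so that fastgpt can call it. Ordinary chat works fine, but during tool calls, long replies are abnormally truncated. I cannot see any difference between the logs of the normal and the abnormal calls. In the abnormal call the model did produce a complete reply, but fastgpt displayed only the last few tokens of it.

Could you please advise on how to troubleshoot this?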
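One way to narrow down which layer drops the text is to capture the raw streamed response at each hop (xinference, then oneapi) and rebuild the reply from it. The sketch below assumes both hops emit OpenAI-style server-sent events, that is, `data: {json}` lines terminated by `data: [DONE]`; the function names `iter_sse_chunks` and `joined_content` are hypothetical helpers, not part of any of these projects.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON payload of every `data:` line until `[DONE]`."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


def joined_content(lines: Iterable[str]) -> str:
    """Concatenate every `delta.content` fragment found in the stream."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

If the text rebuilt from the xinference capture is complete but the text rebuilt from the oneapi capture is not, the loss happens in between; if both are complete, the problem is more likely in how the client consumes the stream.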

xinference log for the abnormal tool call

2024-06-03 11:01:27,675 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 11:01:27,675 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 11:01:27,675 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 11:01:27,675 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 11:01:27,676 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 11:01:27,676 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 11:01:27,676 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 11:01:27,676 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 11:01:27,678 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '查询大语言模型最新新闻', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f141e4970>, 'stream': True}), kwargs: {}
2024-06-03 11:01:27,679 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-03 11:01:27,679 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

vuAvFH: Call this tool to interact with the vuAvFH API. What is the vuAvFH API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

qtzKg6: Call this tool to interact with the qtzKg6 API. What is the qtzKg6 API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [vuAvFH, qtzKg6]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 查询大语言模型最新新闻<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-03 11:01:27,679 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-03 11:01:27,679 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-03 11:01:29,778 xinference.model.llm.utils 824361 DEBUG    Tool call content: 我需要使用Bing搜索来查找大语言模型的最新新闻。, func: qtzKg6, args: {'searchKey': '大语言模型 最新新闻'}
2024-06-03 11:01:42,312 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 11:01:42,312 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 11:01:42,313 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 11:01:42,313 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 11:01:42,313 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 11:01:42,314 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 11:01:42,314 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 11:01:42,314 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 11:01:42,316 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '<TOOL>', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [{'role': 'user', 'content': '查询大语言模型最新新闻'}, {'role': 'assistant', 'tool_calls': [{'id': '046bab40-afbe-4ef9-84d3-ffc6e7781943', 'type': 'function', 'function': {'name': 'qtzKg6', 'arguments': '{"searchKey": "大语言模型 最新新闻"}'}}]}, {'tool_call_id': '046bab40-afbe-4ef9-84d3-ffc6e7781943', 'role': 'tool', 'name': 'qtzKg6', 'content': '{\n  "result": "{\\"prompt\\":\\"The below set forth the Bing search results，you can use this realtime info，answer user\'s question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\\\n\\\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\\\n  Is Navigational Page?：Yes\\\\n\\\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 
与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[中文能力比肩GPT-4，国产大模型GLM-4上线-清华大学](https://www.tsinghua.edu.cn/info/1182/109397.htm)\\\\n  snippet：市科委、中关村管委会主任张继红说，智谱AI的大模型是本市首批完成《生成式人工智能服务管理暂行办法》备案并上线的大模型之一，该系列模型的开源版全球下载量超1000万，是目前下载量和开源影响力最高的国产大模型。 据了解，GLM-4可以支持更长的上下文，同时推理速度更快，大大降低推理成本。...\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[大模型综述11月最新升级](http://ai.ruc.edu.cn/research/science/20231131002.html)\\\\n  snippet：今年3月末，我们在arXiv网站发布了大语言模型综述文章《A Survey of Large Language Models》的第一个版本V1，该综述文章旨在系统地梳理大语言模型的研究进展与核心技术，讨论了大量的相关工作。 自大语言模型综述的预印本上线以来，受到了不少读者的关注，我们在努力推动该综述文章的持续扩充与修订。...\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[大语言模型综述 - Renmin University of China](http://ai.ruc.edu.cn/research/science/20230605100.html)\\\\n  snippet：最近，作为代表性的大语言模型应用ChatGPT展现出了超强的人机对话能力和任务求解能力，对于整个AI研究社区带来了重大影响。...\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[国内首个知识图谱融合大模型平台推出 为大语言模型工业化 ...](https://news.cctv.com/2023/09/09/ARTIv0A1MbGqp4tzHhEDx5sz230909.shtml)\\\\n  snippet：今年以来，以ChatGPT为代表的大语言模型和生成式人工智能成为全球科技热点，并影响到人类的生活和生产方式。 不过全球用户也很快发现，在与大语言模型交互的过程中，会碰到它“一本正经地胡说八道”，输出似是而非甚至荒谬的结果，这种被称作“大模型幻觉”的技术特点阻碍了它在企业和行业的应用与发展。\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[赶超GPT-4！谷歌发布最新大模型Gemini，主打三大“杀手锏 ...](https://new.qq.com/rain/a/20231207A00LFY00)\\\\n  snippet：外界期待已久的谷歌大语言模型Gemini在美国时间12月6日早间正式对外发布，谷歌首席执行官皮查伊表示，Gemini 1.0是目前为止谷歌能力最强的通用人工智能模型。 “Gemini是原生多模态打造，是（谷歌）通往Gemini模型时代的第一步。 ”皮查伊在当天的声明中说。 谷歌当天发布的Gemini 1.0共分为Ultra, Pro和Nano三个版本，其中Ultra的能力最强，复杂度最高，能够处理最为困难的任务，Pro能力稍弱，可以用来处理多任务，Nano则更注重于端侧的处理能力。\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[清华 14 大 LLM 最新评测报告出炉：GPT-4 和 Claude-3 ...](https://www.ithome.com/0/763/013.htm)\\\\n  snippet：自大语言模型诞生之初，评测便成为大模型研究中不可或缺的一部分。 随着大模型研究的发展，对其性能重点的研究也在不断迁移。 根据我们的研究，大模型能力评测大概经历如下 5 个阶段： 2018 年-2021 年：语义评测阶段. 
早期的语言模型主要关注自然语言的理解任务（ e.g. 分词、词性标注、句法分析、信息抽取)，相关评测主要考察语言模型对自然语言的语义理解能力。 代表工作：BERT、GPT、T5 等。 2021 年-2023 年：代码评测阶段. 随着语言模型能力的增强，更具应用价值的代码模型逐渐出现。 研究人员发现，基于代码生成任务训练的模型在测试中展现出更强的逻辑推理能力，代码模型成为研究热点。 代表工作：Codex、CodeLLaMa、CodeGeeX 等。\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[ChatGPT浪潮下，看中国大语言模型产业发展 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26347873)\\\\n  snippet：大语言模型丨研究报告. 导语： ChatGPT这一现象级突围产品的横空出世，拉开了大语言模型产业和生成式AI（AIGC）产业蓬勃发展的序幕。 海外市场，OpenAI、微软、谷歌、Meta等巨头动作频频。 中国市场也百花齐放：百度、阿里、华为、腾讯、360、商汤、京东、科大讯飞、字节跳动等巨头厂商结合自身业务及战略布局，陆续宣布研发或已发布大语言模型产品；垂直赛道及大模型解决方案厂商则锚定一个或多个行业领域，意图打造“数据飞轮”护城河；应用层厂商则积极试水整合大模型能力，提升产品功能；众多科技大佬也宣布进军大模型领域进行创业。 市场热度高涨，中国人工智能产业迎来了难得的发展契机。 在热潮背后，产业的可持续发展，各类参与者的机会和价值点值得深思。\\\\n  Is Navigational Page?：No\\"}"\n}'}], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f141c12a0>, 'stream': True}), kwargs: {}
2024-06-03 11:01:42,316 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-03 11:01:42,317 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

vuAvFH: Call this tool to interact with the vuAvFH API. What is the vuAvFH API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

qtzKg6: Call this tool to interact with the qtzKg6 API. What is the qtzKg6 API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [vuAvFH, qtzKg6]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 查询大语言模型最新新闻<|im_end|>
<|im_start|>assistant
Thought: I can use qtzKg6.
Action: qtzKg6
Action Input: {"searchKey": "大语言模型 最新新闻"}<|im_end|>
<|im_start|>function
Observation: {
  "result": "{\"prompt\":\"The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\n\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\n  Is Navigational Page?：Yes\\n\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\n  Is Navigational Page?：No\\n\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 
与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\n  Is Navigational Page?：No\\n\\n- Title：[中文能力比肩GPT-4，国产大模型GLM-4上线-清华大学](https://www.tsinghua.edu.cn/info/1182/109397.htm)\\n  snippet：市科委、中关村管委会主任张继红说，智谱AI的大模型是本市首批完成《生成式人工智能服务管理暂行办法》备案并上线的大模型之一，该系列模型的开源版全球下载量超1000万，是目前下载量和开源影响力最高的国产大模型。 据了解，GLM-4可以支持更长的上下文，同时推理速度更快，大大降低推理成本。...\\n  Is Navigational Page?：No\\n\\n- Title：[大模型综述11月最新升级](http://ai.ruc.edu.cn/research/science/20231131002.html)\\n  snippet：今年3月末，我们在arXiv网站发布了大语言模型综述文章《A Survey of Large Language Models》的第一个版本V1，该综述文章旨在系统地梳理大语言模型的研究进展与核心技术，讨论了大量的相关工作。 自大语言模型综述的预印本上线以来，受到了不少读者的关注，我们在努力推动该综述文章的持续扩充与修订。...\\n  Is Navigational Page?：No\\n\\n- Title：[大语言模型综述 - Renmin University of China](http://ai.ruc.edu.cn/research/science/20230605100.html)\\n  snippet：最近，作为代表性的大语言模型应用ChatGPT展现出了超强的人机对话能力和任务求解能力，对于整个AI研究社区带来了重大影响。...\\n  Is Navigational Page?：No\\n\\n- Title：[国内首个知识图谱融合大模型平台推出 为大语言模型工业化 ...](https://news.cctv.com/2023/09/09/ARTIv0A1MbGqp4tzHhEDx5sz230909.shtml)\\n  snippet：今年以来，以ChatGPT为代表的大语言模型和生成式人工智能成为全球科技热点，并影响到人类的生活和生产方式。 不过全球用户也很快发现，在与大语言模型交互的过程中，会碰到它“一本正经地胡说八道”，输出似是而非甚至荒谬的结果，这种被称作“大模型幻觉”的技术特点阻碍了它在企业和行业的应用与发展。\\n  Is Navigational Page?：No\\n\\n- Title：[赶超GPT-4！谷歌发布最新大模型Gemini，主打三大“杀手锏 ...](https://new.qq.com/rain/a/20231207A00LFY00)\\n  snippet：外界期待已久的谷歌大语言模型Gemini在美国时间12月6日早间正式对外发布，谷歌首席执行官皮查伊表示，Gemini 1.0是目前为止谷歌能力最强的通用人工智能模型。 “Gemini是原生多模态打造，是（谷歌）通往Gemini模型时代的第一步。 ”皮查伊在当天的声明中说。 谷歌当天发布的Gemini 1.0共分为Ultra, Pro和Nano三个版本，其中Ultra的能力最强，复杂度最高，能够处理最为困难的任务，Pro能力稍弱，可以用来处理多任务，Nano则更注重于端侧的处理能力。\\n  Is Navigational Page?：No\\n\\n- Title：[清华 14 大 LLM 最新评测报告出炉：GPT-4 和 Claude-3 ...](https://www.ithome.com/0/763/013.htm)\\n  snippet：自大语言模型诞生之初，评测便成为大模型研究中不可或缺的一部分。 随着大模型研究的发展，对其性能重点的研究也在不断迁移。 根据我们的研究，大模型能力评测大概经历如下 5 个阶段： 2018 年-2021 年：语义评测阶段. 早期的语言模型主要关注自然语言的理解任务（ e.g. 
分词、词性标注、句法分析、信息抽取)，相关评测主要考察语言模型对自然语言的语义理解能力。 代表工作：BERT、GPT、T5 等。 2021 年-2023 年：代码评测阶段. 随着语言模型能力的增强，更具应用价值的代码模型逐渐出现。 研究人员发现，基于代码生成任务训练的模型在测试中展现出更强的逻辑推理能力，代码模型成为研究热点。 代表工作：Codex、CodeLLaMa、CodeGeeX 等。\\n  Is Navigational Page?：No\\n\\n- Title：[ChatGPT浪潮下，看中国大语言模型产业发展 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26347873)\\n  snippet：大语言模型丨研究报告. 导语： ChatGPT这一现象级突围产品的横空出世，拉开了大语言模型产业和生成式AI（AIGC）产业蓬勃发展的序幕。 海外市场，OpenAI、微软、谷歌、Meta等巨头动作频频。 中国市场也百花齐放：百度、阿里、华为、腾讯、360、商汤、京东、科大讯飞、字节跳动等巨头厂商结合自身业务及战略布局，陆续宣布研发或已发布大语言模型产品；垂直赛道及大模型解决方案厂商则锚定一个或多个行业领域，意图打造“数据飞轮”护城河；应用层厂商则积极试水整合大模型能力，提升产品功能；众多科技大佬也宣布进军大模型领域进行创业。 市场热度高涨，中国人工智能产业迎来了难得的发展契机。 在热潮背后，产业的可持续发展，各类参与者的机会和价值点值得深思。\\n  Is Navigational Page?：No\"}"
}<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-03 11:01:42,317 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-03 11:01:42,317 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-03 11:01:58,354 xinference.model.llm.utils 824361 DEBUG    Tool call content: 最新的大语言模型新闻包括：

1. GPT-4-Turbo在各项评测中表现出最佳性能，国内厂商如智谱、阿里巴巴和百度也发布了新的模型，但与GPT-4仍存在差距。[[1](https://www.thepaper.cn/newsDetail_forward_26209606)]

2. 大语言模型的发展迅速，从T5到GPT-4，国内和国际的研究都在不断推进，但复杂推理能力仍然是挑战。[[2](https://www.thepaper.cn/newsDetail_forward_22557232)]

3. Meta发布了开源大模型Llama 3，该模型已应用于Meta的全系应用，并且亚马逊云科技也支持其部署和推理运行。[[3](https://www.thepaper.cn/newsDetail_forward_27106328)]

4. 清华大学智谱AI发布了GLM-4，其中文能力接近GPT-4，且支持更长的上下文和更快的推理速度。[[4](https://www.tsinghua.edu.cn/info/1182/109397.htm)]

5. 谷歌发布了大模型Gemini，这是其能力最强的通用人工智能模型，分为Ultra、Pro和Nano三个版本。[[8](https://new.qq.com/rain/a/20231207A00LFY00)]

这些新闻反映了大语言模型领域的快速发展和竞争，以及各公司和研究机构在提升模型性能和应用范围方面的努力。, func: None, args: None

xinference log for the normal tool call

2024-06-03 10:01:26,760 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 10:01:26,760 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 10:01:26,760 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 10:01:26,760 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 10:01:26,761 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 10:01:26,761 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 10:01:26,761 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 10:01:26,761 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 10:01:26,763 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '现在是几点', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f141d7a60>, 'stream': True}), kwargs: {}
2024-06-03 10:01:26,763 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-03 10:01:26,763 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

eGRfoB: Call this tool to interact with the eGRfoB API. What is the eGRfoB API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

ozhjjr: Call this tool to interact with the ozhjjr API. What is the ozhjjr API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [eGRfoB, ozhjjr]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 现在是几点<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-03 10:01:26,763 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-03 10:01:26,763 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-03 10:01:28,381 xinference.model.llm.utils 824361 DEBUG    Tool call content: 我需要调用eGRfoB API来获取当前时间, func: eGRfoB, args: {}
2024-06-03 10:01:28,470 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 10:01:28,470 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 10:01:28,470 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 10:01:28,470 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-03 10:01:28,471 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-03 10:01:28,471 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-03 10:01:28,471 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 10:01:28,471 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-03 10:01:28,473 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '<TOOL>', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [{'role': 'user', 'content': '现在是几点'}, {'role': 'assistant', 'tool_calls': [{'id': 'e3a040d4-adf7-4f2a-8867-36edc155fc3c', 'type': 'function', 'function': {'name': 'eGRfoB', 'arguments': '{}'}}]}, {'tool_call_id': 'e3a040d4-adf7-4f2a-8867-36edc155fc3c', 'role': 'tool', 'name': 'eGRfoB', 'content': '{\n  "time": "2024-06-03 10:01:40 Monday"\n}'}], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f141d7cd0>, 'stream': True}), kwargs: {}
2024-06-03 10:01:28,473 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-03 10:01:28,473 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

eGRfoB: Call this tool to interact with the eGRfoB API. What is the eGRfoB API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

ozhjjr: Call this tool to interact with the ozhjjr API. What is the ozhjjr API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [eGRfoB, ozhjjr]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 现在是几点<|im_end|>
<|im_start|>assistant
Thought: I can use eGRfoB.
Action: eGRfoB
Action Input: {}<|im_end|>
<|im_start|>function
Observation: {
  "time": "2024-06-03 10:01:40 Monday"
}<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-03 10:01:28,474 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-03 10:01:28,474 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-03 10:01:31,089 xinference.model.llm.utils 824361 DEBUG    Tool call content: 现在是2024年6月3日星期一10点01分40秒。, func: None, args: None

config.json

{
      "model": "qwen:72b",
      "name": "qwen:72b",
      "maxContext": 32000,
      "avatar": "/imgs/model/qwen.svg",
      "maxResponse": 6000,
      "quoteMaxToken": 13000,
      "maxTemperature": 1.2,
      "charsPointsPrice": 0,
      "censor": false,
      "vision": false,
      "datasetProcess": false,
      "usedInClassify": true,
      "usedInExtractFields": true,
      "usedInToolCall": true,
      "usedInQueryExtension": true,
      "toolChoice": true,
      "functionCall": true,
      "customCQPrompt": "",
      "customExtractPrompt": "",
      "defaultSystemChatPrompt": "",
      "defaultConfig": {}
    },

zhanghx0905 commented 1 month ago

xinference's implementation in this area used to have problems. You could consider setting { "toolChoice": false, "functionCall": false } so that fastgpt falls back to its built-in prompt.
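As a minimal sketch of that workaround, the snippet below turns both flags off for a single model entry. It assumes the model entries have already been loaded from config.json as a list of dicts shaped like the entry posted above; the function name `disable_native_tool_calls` is hypothetical, and locating the list inside the file is left out on purpose.

```python
import copy
from typing import Any, Dict, List


def disable_native_tool_calls(
    models: List[Dict[str, Any]], model_name: str
) -> List[Dict[str, Any]]:
    """Return a copy of `models` with toolChoice/functionCall off for one model."""
    patched = copy.deepcopy(models)  # leave the caller's data untouched
    for entry in patched:
        if entry.get("model") == model_name:
            entry["toolChoice"] = False
            entry["functionCall"] = False
    return patched
```

Other fields such as `usedInToolCall` are left as they are, so the model presumably stays selectable for tool-call nodes while the prompt-based path is used.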

JinCheng666 commented 1 month ago

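> xinference's implementation in this area used to have problems. You could consider setting { "toolChoice": false, "functionCall": false } so that fastgpt falls back to its built-in prompt.

The prompt-based approach works rather poorly. fastgpt has now been adapted to qwen's toolchoice, and that works considerably better than the prompt.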

zhanghx0905 commented 1 month ago

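xinference also uses a prompt; at most it differs from fastgpt in the wording of the prompt. I don't know how qwen's SaaS API implements tool calls. Perhaps it uses a strategy similar to guided grammar.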
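To illustrate what a prompt-based tool call involves, here is a rough sketch of parsing the Thought / Action / Action Input / Final Answer format visible in the logs above. It is not xinference's actual parser; the function name `parse_react` and its return shape are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional, Tuple

ACTION = "\nAction:"
ACTION_INPUT = "\nAction Input:"
FINAL = "Final Answer:"


def parse_react(text: str) -> Tuple[str, Optional[str], Optional[Dict[str, Any]]]:
    """Split model output into (content, tool_name, tool_args).

    tool_name and tool_args are None when the output is a plain answer.
    """
    padded = "\n" + text  # lets a marker on the very first line match too
    a = padded.rfind(ACTION)
    i = padded.rfind(ACTION_INPUT)
    if 0 <= a < i:
        thought = padded[:a].replace("Thought:", "", 1).strip()
        name = padded[a + len(ACTION):i].strip()
        raw_args = padded[i + len(ACTION_INPUT):].strip()
        try:
            args = json.loads(raw_args) if raw_args else {}
        except json.JSONDecodeError:
            # Arguments are not valid JSON: treat the output as plain text.
            return text.strip(), None, None
        return thought, name, args
    f = text.rfind(FINAL)
    content = text[f + len(FINAL):] if f >= 0 else text
    return content.strip(), None, None
```

Because the decision between "tool call" and "plain answer" can only be made once the markers have been seen, a streaming implementation of this kind has to buffer output before forwarding it, which is one place where text could be lost.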

jacnmm4 commented 1 month ago

The qwen1.5 models support native tool calling.

jacnmm4 commented 1 month ago

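> > xinference's implementation in this area used to have problems. You could consider setting { "toolChoice": false, "functionCall": false } so that fastgpt falls back to its built-in prompt.
>
> The prompt-based approach works rather poorly. fastgpt has now been adapted to qwen's toolchoice, and that works considerably better than the prompt.

I ran into this problem too. It is a major bug and it occurs very frequently.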

c121914yu commented 1 month ago

You could log the toolChoice content and check whether the stream is normal.
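A minimal sketch of such logging, assuming the stream has already been decoded into OpenAI-style chat completion chunks (dicts with `choices[].delta`); the function name `summarize_stream` is hypothetical. It rebuilds the text and the tool-call arguments and records which chunk contributed what.

```python
from typing import Any, Dict, Iterable, List, Tuple


def summarize_stream(
    chunks: Iterable[Dict[str, Any]],
) -> Tuple[str, Dict[int, Dict[str, str]], List[str]]:
    """Return (full_text, tool_calls_by_index, trace) for a chunk stream."""
    content_parts: List[str] = []
    tool_calls: Dict[int, Dict[str, str]] = {}
    trace: List[str] = []
    for n, chunk in enumerate(chunks):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if text:
                content_parts.append(text)
                trace.append(f"chunk {n}: content +{len(text)} chars")
            for call in delta.get("tool_calls") or []:
                # Fragments of one call share the same `index`.
                slot = tool_calls.setdefault(
                    call.get("index", 0), {"name": "", "arguments": ""}
                )
                fn = call.get("function") or {}
                slot["name"] += fn.get("name") or ""
                slot["arguments"] += fn.get("arguments") or ""
                trace.append(f"chunk {n}: tool_call delta")
            if choice.get("finish_reason"):
                trace.append(f"chunk {n}: finish_reason={choice['finish_reason']}")
    return "".join(content_parts), tool_calls, trace
```

Reading the trace shows whether the long reply arrives as many small content deltas or as one large chunk at the end, which is worth comparing between a truncated and a normal call.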

JinCheng666 commented 1 month ago

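> You could log the toolChoice content and check whether the stream is normal.

@c121914yu Sorry, I don't understand which program's log I should look at. I'll paste the "view details" section from the fastgpt debug interface here; that part looks fine. I'm also including the xinference debug log.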

Plugin output value

{
  "result": "{\"prompt\":\"The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\n\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\n  Is Navigational Page?：Yes\\n\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\n  Is Navigational Page?：No\\n\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\n  Is Navigational Page?：No\"}"
}

Plugin details

body:
{
  "searchKey": "大语言模型 最新新闻"
}
Response body
{
  "prompt": "The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\n\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\n  Is Navigational Page?：Yes\n\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\n  Is Navigational Page?：No\n\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\n  Is Navigational Page?：No"
}

xinference debug log

2024-06-04 08:36:45,493 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-04 08:36:45,494 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-04 08:36:45,494 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-04 08:36:45,494 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-04 08:36:45,495 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-04 08:36:45,495 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-04 08:36:45,495 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-04 08:36:45,495 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-04 08:36:45,497 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '查询大语言模型最新新闻', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f58166f20>, 'stream': True}), kwargs: {}
2024-06-04 08:36:45,497 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-04 08:36:45,497 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

oFKYtn: Call this tool to interact with the oFKYtn API. What is the oFKYtn API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

l4Cnqy: Call this tool to interact with the l4Cnqy API. What is the l4Cnqy API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [oFKYtn, l4Cnqy]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 查询大语言模型最新新闻<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-04 08:36:45,497 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-04 08:36:45,498 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-04 08:36:47,613 xinference.model.llm.utils 824361 DEBUG    Tool call content: 我需要使用Bing搜索来查找大语言模型的最新新闻。, func: l4Cnqy, args: {'searchKey': '大语言模型 最新新闻'}
2024-06-04 08:37:00,070 xinference.core.supervisor 20303 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-04 08:37:00,071 xinference.core.worker 20303 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-04 08:37:00,071 xinference.core.worker 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-04 08:37:00,071 xinference.core.supervisor 20303 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-04 08:37:00,072 xinference.core.supervisor 20303 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f1ec561cf40>, 'qwen:72b'), kwargs: {}
2024-06-04 08:37:00,072 xinference.core.worker 20303 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f203e884130>,), kwargs: {'model_uid': 'qwen:72b-1-0'}
2024-06-04 08:37:00,072 xinference.core.worker 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-04 08:37:00,072 xinference.core.supervisor 20303 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-04 08:37:00,074 xinference.core.model 824361 DEBUG    Enter wrapped_func, args: (<xinference.core.model.ModelActor object at 0x7f2a1b304770>, '<TOOL>', '你的任务是：根据问题生成搜索关键词，调用Bing网络查询。\n在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。\n\n\n\r', [{'role': 'user', 'content': '查询大语言模型最新新闻'}, {'role': 'assistant', 'tool_calls': [{'id': '12fc88f9-477c-4565-aa80-944a61ca5b82', 'type': 'function', 'function': {'name': 'l4Cnqy', 'arguments': '{"searchKey": "大语言模型 最新新闻"}'}}]}, {'tool_call_id': '12fc88f9-477c-4565-aa80-944a61ca5b82', 'role': 'tool', 'name': 'l4Cnqy', 'content': '{\n  "result": "{\\"prompt\\":\\"The below set forth the Bing search results，you can use this realtime info，answer user\'s question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\\\n\\\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\\\n  Is Navigational Page?：Yes\\\\n\\\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\\\n  Is Navigational Page?：No\\\\n\\\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 
与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\\\n  Is Navigational Page?：No\\"}"\n}'}], {'temperature': 0.0, 'tool_choice': 'auto', 'tools': <list_iterator object at 0x7f1f14186770>, 'stream': True}), kwargs: {}
2024-06-04 08:37:00,075 xinference.core.model 824361 DEBUG    Request chat, current serve request count: 0, request limit: None for the model qwen:72b
2024-06-04 08:37:00,075 xinference.model.llm.vllm.core 824361 DEBUG    Enter generate, prompt: <|im_start|>system
你的任务是：根据问题生成搜索关键词，调用Bing网络查询。
在调用Bing网络查询时，请确保生成的搜索词准确反映了问题的核心内容。在回答问题时，要充分利用知识库和网络查询的结果，以确保回答的准确性和完整性。

<|im_end|>
<|im_start|>user
Answer the following questions as best you can. You have access to the following APIs:

oFKYtn: Call this tool to interact with the oFKYtn API. What is the oFKYtn API useful for? 获取用户当前时区的时间。 Parameters: [] Format the arguments as a JSON object.

l4Cnqy: Call this tool to interact with the l4Cnqy API. What is the l4Cnqy API useful for? Performs a Bing search for the given search key. Parameters: [{"name": "searchKey", "type": "string", "description": "The search term to use in the Bing search."}] Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [oFKYtn, l4Cnqy]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 查询大语言模型最新新闻<|im_end|>
<|im_start|>assistant
Thought: I can use l4Cnqy.
Action: l4Cnqy
Action Input: {"searchKey": "大语言模型 最新新闻"}<|im_end|>
<|im_start|>function
Observation: {
  "result": "{\"prompt\":\"The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\n\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\n  Is Navigational Page?：Yes\\n\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\n  Is Navigational Page?：No\\n\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\n  Is Navigational Page?：No\"}"
}<|im_end|>
<|im_start|>assistant
, generate config: {'temperature': 0.0, 'tool_choice': 'auto', 'stream': True, 'stop': ['<|endoftext|>', '<|im_start|>', '<|im_end|>', 'Observation:'], 'stop_token_ids': [151643, 151644, 151645]}
2024-06-04 08:37:00,075 xinference.core.model 824361 DEBUG    After request chat, current serve request count: 0 for the model qwen:72b
2024-06-04 08:37:00,075 xinference.core.model 824361 DEBUG    Leave wrapped_func, elapsed time: 0 s
2024-06-04 08:37:10,720 xinference.model.llm.utils 824361 DEBUG    Tool call content: 最近的大语言模型新闻包括：

1. 一篇来自澎湃新闻的文章提到，2023年4月4日，自然语言处理（NLP）领域取得了显著进步，大语言模型（LLM）的发展已经接近通用人工智能的门槛。文章提到了从T5到GPT-4的最全盘点，并指出国内的模型也在快速发展[[1]()]。

2. 另一篇来自澎湃新闻的报道指出，OpenCompass对过去一年的大模型进行了全面评测，结果显示GPT-4-Turbo表现出色，国内的智谱清言GLM-4、阿里巴巴Qwen-Max和百度文心一言4.0紧随其后。尽管国内模型在某些方面接近GPT-4，但在复杂推理能力上仍有提升空间[[2]()]。

3. Meta公司发布了开源大模型Llama 3，扎克伯格宣布这个模型已应用于Meta的全系应用，包括Instagram、WhatsApp和Facebook。此外，Llama 3还支持图像生成功能[[3]()]。

这些新闻反映了大语言模型领域的快速发展和竞争，以及技术在不同应用场景中的应用。, func: None, args: None
2024-06-04 08:37:10,774 xoscar.backends.core 20232 WARNING  Actor caller has created too many clients (820 >= 100), the global router may not be set.

c121914yu commented 1 month ago

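You can log the toolChoice content to check whether the stream is normal.

Sorry, I don't follow: which program's log should I look at? I'll paste the "view details" section from the FastGPT debug panel, which looks fine, along with the xinference debug log.

Plugin output value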

{
  "result": "{\"prompt\":\"The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\\n\\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\\n  Is Navigational Page?：Yes\\n\\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\\n  Is Navigational Page?：No\\n\\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\\n  Is Navigational Page?：No\"}"
}

Plugin details

body:
{
  "searchKey": "大语言模型 最新新闻"
}
Response body
{
  "prompt": "The below set forth the Bing search results，you can use this realtime info，answer user's question。the Searchkey: 大语言模型 最新新闻; SearchResult:\n\n- Title：[大型语言模型综述全新出炉：从T5到GPT-4最全盘点，国内 ...](https://www.thepaper.cn/newsDetail_forward_22557232)\n  snippet：2023-04-04 11:58. 来源：澎湃新闻·澎湃号·湃客. 字号. 机器之心报道. 机器之心编辑部. 为什么仿佛一夜之间，自然语言处理（NLP）领域就突然突飞猛进，摸到了通用人工智能的门槛？ 如今的大语言模型（LLM）发展到了什么程度？ 未来短时间内，AGI 的发展路线又将如何？ 自 20 世纪 50 年代图灵测试提出以来，人们始终在探索机器处理语言智能的能力。 语言本质上是一个错综复杂的人类表达系统，受到语法规则的约束。 因此，开发能够理解和精通语言的强大 AI 算法面临着巨大挑战。 过去二十年，语言建模方法被广泛用于语言理解和生成，包括统计语言模型和神经语言模型。\n  Is Navigational Page?：Yes\n\n- Title：[大模型年度榜单公布：GPT-4第一，智谱、阿里紧追 - 澎湃新闻](https://www.thepaper.cn/newsDetail_forward_26209606)\n  snippet：对过去一年来主流大模型全面评测诊断后，结果显示，GPT-4-Turbo在各项评测中均获最佳表现，国内厂商近期发布的模型紧随其后，包括智谱清言GLM-4、阿里巴巴Qwen-Max、百度文心一言4.0。 评测是大模型的指挥棒和指南针，OpenCompass为模型提供评测服务，量化模型在知识、语言、理解、推理和考试等五大能力维度的表现。 总体来看，大语言模型整体能力仍有较大提升空间，复杂推理相关能力仍是大模型普遍面临的难题，国内大模型相比于GPT-4还存在差距。 中文场景下国内最新大模型已展现出优势，在部分维度上接近GPT-4-Turbo的水平。 中英双语客观评测：数学、代码仍是短板.\n  Is Navigational Page?：No\n\n- Title：[澎湃新闻 - 开源大模型Llama 3王者归来：最大底牌4000亿 ...](https://www.thepaper.cn/newsDetail_forward_27106328)\n  snippet：本周四，AI 领域迎来重大消息，Meta 正式发布了人们等待已久的开源大模型 Llama 3。 扎克伯格在 Facebook 上发帖：Big AI news today. 与此同时，扎克伯格宣布：基于最新的 Llama 3 模型，Meta 的 AI 助手现在已经覆盖 Instagram、WhatsApp、Facebook 等全系应用，并单独开启了网站。 另外还有一个图像生成器，可根据自然语言提示词生成图片。 Meta AI 网址：https://www.meta.ai/ 同日，亚马逊云科技也宣布，Meta Llama 3 基础模型已可通过 Amazon SageMaker JumpStart来部署和推理运行。\n  Is Navigational Page?：No"
}

xinference debug log: identical to the 2024-06-04 08:36:45 log shown above.

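You need to modify the FastGPT code to print the corresponding stream output values, then check whether the stream is returned correctly and whether it is captured correctly.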
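The stream can also be checked outside FastGPT. The following is only a minimal sketch, assuming the backend returns OpenAI-style server-sent-event chunks (`data: {...}` lines ending with `data: [DONE]`); the function name `accumulate_sse` and the sample lines are hypothetical and are not taken from FastGPT or xinference. It prints every content delta as it arrives and rebuilds the full text, so the result can be compared with what the UI displays.

```python
import json
from typing import Iterable, List, Tuple


def accumulate_sse(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Rebuild assistant text from OpenAI-style streaming chunks.

    Returns the concatenated text and the list of individual content
    deltas, so both can be compared with what the front end shows.
    """
    deltas: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                deltas.append(content)
                print(repr(content))  # log each delta as it arrives
    return "".join(deltas), deltas


# Hypothetical sample stream, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Recent LLM news "}}]}',
    'data: {"choices": [{"delta": {"content": "includes:"}}]}',
    "data: [DONE]",
]
text, deltas = accumulate_sse(sample)
print(text)
```

If the rebuilt text is complete while the UI shows only its tail, the loss probably happens on the consuming side; if the deltas themselves are incomplete, the serving side is the more likely place to look.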

stevensy123 commented 1 month ago

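I ran into the same problem. The tool call itself works and returns a result, but the final output is truncated and only the last few characters appear. Meanwhile, the xinference backend shows the output token speed. I'm also using qwen.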

JinCheng666 commented 1 month ago

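You need to modify the FastGPT code to print the corresponding stream output values, then check whether the stream is returned correctly and whether it is captured correctly.

Thanks. We haven't run FastGPT from source yet, and nobody on our team knows Next.js at the moment, so we may have to learn it first. If it's convenient, could you point out which file to change? Thanks again @c121914yu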

zhanghx0905 commented 1 month ago

@JinCheng666 Please try https://github.com/zhanghx0905/inference and see whether it solves the problem.

jacnmm4 commented 1 month ago

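Has anyone solved this? I'm also using a qwen model. xinference shows the complete output in its log, but the fastchat front end truncates it and displays only the last few characters. The problem is very frequent and easy to reproduce, roughly 50% of the time, and it mainly occurs when using "function call tools".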

JinCheng666 commented 1 month ago

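@JinCheng666 Please try https://github.com/zhanghx0905/inference and see whether it solves the problem.

@zhanghx0905 Do you mean I should ask in that project? It looks like just a fork of xinference. What is the difference?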

zhanghx0905 commented 1 month ago

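Do you mean I should ask in that project? It looks like just a fork of xinference. What is the difference?

This branch includes a patch for this problem. Download it locally, pip install it, and see whether it meets your needs.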

JinCheng666 commented 2 weeks ago

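The abnormal truncation problem is solved, so I'm closing this issue. The remaining problem is that the output now includes the reference process; I've opened a new issue for it.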

https://github.com/labring/FastGPT/issues/1811

labring / FastGPT

qwen tool calling: replies are abnormally truncated #1668