modelscope / dash-infer

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
Apache License 2.0

When prompt_token exceeds the model's maximum supported length, the program cannot recover and keeps returning an error message #28

Closed yejunjin closed 1 month ago

yejunjin commented 4 months ago

While integrating dashinfer into fastchat, I found that when the prompt token count exceeds `engine_max_length` — specifically, when `.generation_config.max_length` < prompt token count < `.engine_config.engine_max_length` — the engine cannot recover.

yejunjin commented 4 months ago

With `.engine_config.engine_max_length = 128` and `.generation_config.max_length = 64` in `config.json`, an input prompt of roughly 80 tokens reproduces the issue.
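For reference, a minimal `config.json` fragment matching the reproduction settings above might look like this (field layout is a sketch based on the key paths named in the comment; other required fields are omitted):

```json
{
  "engine_config": {
    "engine_max_length": 128
  },
  "generation_config": {
    "max_length": 64
  }
}
```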

Reproduction log: (screenshot attached in the original issue)

Cause: the code at https://github.com/modelscope/dash-infer/blob/40cddfd6b4cc0a0c75141c3cf5fd35a572c4d3b9/csrc/core/model/model.cpp#L403 does not check the returned status, so execution continues even when an error status is returned.

chuanzhubin commented 4 months ago