modelscope / dash-infer

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
Apache License 2.0

When prompt_token exceeds the model's maximum supported length, the program cannot recover and keeps returning an error message #28

Closed yejunjin closed 1 month ago

yejunjin commented 4 months ago

While integrating dashinfer into fastchat, I found that when the prompt token count exceeds `engine_max_length` — specifically, when `.generation_config.max_length` < prompt token count < `.engine_config.engine_max_length` — the engine cannot recover.

yejunjin commented 4 months ago

With `.engine_config.engine_max_length = 128` and `.generation_config.max_length = 64` in `config.json`, an input prompt of roughly 80 tokens reproduces the issue.
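For reference, a minimal `config.json` fragment matching the reproduction settings above might look like this (field layout is a sketch based on the key paths named in the comment; other required fields are omitted):

```json
{
  "engine_config": {
    "engine_max_length": 128
  },
  "generation_config": {
    "max_length": 64
  }
}
```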

Reproduction log: (screenshot attached in the original issue)

Cause: the code at https://github.com/modelscope/dash-infer/blob/40cddfd6b4cc0a0c75141c3cf5fd35a572c4d3b9/csrc/core/model/model.cpp#L403 does not check the returned status, so execution continues even when an error status is returned.

chuanzhubin commented 4 months ago