My model-loading code looks like this:

```python
from lmdeploy import pipeline, PytorchEngineConfig

if self.TYPE == "lmdeploy" or "internlm" in self.MODEL_NAME:
    self.TYPE = "lmdeploy"
    # Change this to the checkpoint path of your own LoRA output.
    lora_path = '/media/xpl/046f618a-eb06-418b-8a93-d21606fd30c2/tianchi_docker/app/models/llms/Qwen2.5-32B-Instruct-AWQ-lora/checkpoint-354'
    backend_config = PytorchEngineConfig(session_len=2048,
                                         adapters=dict(lora_name_1=lora_path))
    # backend_config = PytorchEngineConfig(tp=1, block_size=32, cache_max_entry_count=0.2)
    self.LLM = pipeline(self.MODEL_PATH, backend_config=backend_config)
```
Describe the bug
I am using Qwen2.5 32B 4-bit AWQ, fine-tuned with LoRA. In my tests it infers correctly through transformers, and lmdeploy also infers correctly. But when I run batch inference, i.e. once the number of prompts grows, the model generates nothing but exclamation marks (!). I don't know what causes this (see the sketch under Reproduction below).
Reproduction
1
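For context, here is a minimal sketch of the kind of batch call that triggers the symptom. It is not from the original report: the model path, prompt list, and generation settings are placeholders, and the `adapter_name` keyword follows lmdeploy's LoRA pipeline example (the exact keyword may differ across lmdeploy versions).

```python
from lmdeploy import pipeline, GenerationConfig, PytorchEngineConfig

# Load the AWQ base model plus the LoRA adapter on the PyTorch engine,
# mirroring the loading snippet above (paths are placeholders).
backend_config = PytorchEngineConfig(
    session_len=2048,
    adapters=dict(lora_name_1='/path/to/lora/checkpoint-354'))
pipe = pipeline('/path/to/Qwen2.5-32B-Instruct-AWQ',
                backend_config=backend_config)

# A single prompt decodes normally; per the report, it is only when the
# batch grows that outputs degenerate into runs of '!'.
prompts = [f'Question {i}: briefly introduce LoRA.' for i in range(32)]
responses = pipe(prompts,
                 gen_config=GenerationConfig(max_new_tokens=256),
                 adapter_name='lora_name_1')
for r in responses:
    print(r.text)
```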
Environment
Error traceback