After fine-tuning, multi-GPU inference fails with a dtype error, "expected scalar type Half but found Float"; single-GPU inference does not have this problem

mymusise / ChatGLM-Tuning

A fine-tuning solution based on ChatGLM-6B + LoRA

MIT License


Issue #224

Open Tungsong opened 1 year ago

Tungsong commented 1 year ago

For multi-GPU inference I followed the official ChatGLM-6B multi-GPU deployment guide: https://github.com/THUDM/ChatGLM-6B#%E5%A4%9A%E5%8D%A1%E9%83%A8%E7%BD%B2

applepieiris commented 1 year ago

I ran into the same problem. I then added the following to inference.py:

    device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
    input_ids = input_ids.to(device)  # added right after input_ids = torch.LongTensor([ids])

After that, running CUDA_VISIBLE_DEVICES=0 python inference.py still raises the same error.
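The message "expected scalar type Half but found Float" refers to a floating-point dtype mismatch (float16 vs. float32), not to device placement, which may be why moving input_ids to a device did not change anything. The sketch below is only a diagnostic aid, not a confirmed fix for this issue: it counts a model's parameters by dtype so that a float16/float32 mix becomes visible. The two-layer model is a hypothetical stand-in for a half-precision base model with a float32 adapter attached, and `dtype_summary` is an illustrative helper name, not part of this repository.

```python
from collections import Counter

import torch
from torch import nn


def dtype_summary(model: nn.Module) -> Counter:
    """Count parameter tensors by dtype (e.g. 'torch.float16')."""
    return Counter(str(p.dtype) for p in model.parameters())


# Hypothetical stand-in: a float16 "base" layer plus a float32 "adapter" layer.
base = nn.Linear(4, 4).half()
adapter = nn.Linear(4, 4)  # float32 by default
model = nn.Sequential(base, adapter)

# A mix of float16 and float32 parameters shows up as two keys here.
print(dtype_summary(model))

# Moving a tensor to a device leaves its dtype unchanged,
# so this step alone cannot resolve a Half/Float mismatch.
input_ids = torch.LongTensor([[1, 2, 3]])
moved = input_ids.to(torch.device("cpu"))
print(moved.dtype)

# Casting every floating-point parameter to one dtype removes the mix.
model.half()
print(dtype_summary(model))
```

Running the same helper on the actual fine-tuned model, once for the single-GPU setup and once for the multi-GPU setup, would show whether the two load paths end up with different parameter dtypes. Whether that is the cause here is an assumption that this thread does not confirm.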