是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
[X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
[X] 我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
Following the official multi-GPU deployment instructions, MiniCPM-Llama3-V deploys and runs inference successfully on two 16 GB 3080 GPUs. However, when deploying OmniLMM-12B across multiple GPUs, even after setting the device_map as instructed:
device_map["model.embed_tokens"] = 0
device_map["model.layers.0"] = 0
device_map["model.layers.31"] = 0
device_map["model.norm"] = 0
device_map["model.resampler"] = 0
device_map["model.vision_tower"] = 0
device_map["lm_head"] = 0
to keep the inputs and outputs on the same GPU, it still raises an error saying the data is not on a single device: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
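For reference, a minimal sketch of how such a device_map might be built before dispatching the model. The module names follow the snippet above; the helper name, the layer count (32), and the even split of the remaining transformer layers across the two GPUs are assumptions, not the official OmniLMM-12B recipe.

```python
def build_device_map(num_layers: int = 32, num_gpus: int = 2) -> dict:
    """Spread transformer layers over the available GPUs, then pin the
    modules that must share a device (embeddings, first/last layer, norm,
    resampler, vision tower, lm_head) onto GPU 0."""
    device_map = {}
    layers_per_gpu = (num_layers + num_gpus - 1) // num_gpus
    for i in range(num_layers):
        device_map[f"model.layers.{i}"] = i // layers_per_gpu
    # Keep inputs and outputs on the same device, as the instructions require:
    for key in ("model.embed_tokens", "model.layers.0",
                f"model.layers.{num_layers - 1}", "model.norm",
                "model.resampler", "model.vision_tower", "lm_head"):
        device_map[key] = 0
    return device_map
```

The resulting dict would then be passed as the `device_map` argument when loading the model (e.g. via accelerate's dispatch utilities); whether this resolves the cuda:0 / cuda:1 mismatch for OmniLMM-12B specifically is exactly what this issue is asking.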
期望行为 | Expected Behavior
How can this problem be solved?
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response