I successfully ran demo/inference.py on the CPU, but it responds slowly. Because a single 3090 GPU does not have enough memory, I attempted to run the model on two GPUs. However, I get an error in Chat.answer(): "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!". Screenshot of the error:
I also printed the model's device map:
I am unsure why this error occurs. I have been trying to fix it all day. Any insights or solutions would be greatly appreciated.
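This error usually means the model's layers are sharded across cuda:0 and cuda:1 (e.g. via a `device_map`), but the input tensors built inside Chat.answer() were placed on a different GPU than the embedding layer that consumes them. A common workaround is to move every input tensor to the device of the model's first shard before calling generate. The sketch below is a hypothetical helper (`move_to_first_device` is not part of the repo); the actual fix inside Chat.answer() may differ:

```python
import torch

def move_to_first_device(batch, device):
    """Recursively move all tensors in a (possibly nested) batch to one device.

    Fix for "Expected all tensors to be on the same device": the inputs
    must live on the GPU that holds the model's first (embedding) layer,
    e.g. the device listed first in model.hf_device_map.
    """
    if torch.is_tensor(batch):
        return batch.to(device)
    if isinstance(batch, dict):
        return {k: move_to_first_device(v, device) for k, v in batch.items()}
    if isinstance(batch, (list, tuple)):
        return type(batch)(move_to_first_device(v, device) for v in batch)
    return batch  # non-tensor leaves (strings, ints, ...) pass through unchanged
```

Before generation you would then do something like `inputs = move_to_first_device(inputs, first_device)`, where `first_device` is taken from the first entry of `model.hf_device_map` (assuming the model was loaded with `device_map="auto"` from accelerate/transformers).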