microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

RuntimeError with BLOOMZ: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #598

Open karim1104 opened 1 year ago

karim1104 commented 1 year ago

I'm hitting the above error in both stage 1 and stage 2 when using BLOOMZ 3B and 560M. I tried adding "model.to(device)" and "model.to('cuda')" to main.py, but neither worked. The error only appears when I switch from Llama to BLOOMZ.
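
One way to narrow down an error like this is to list which parameters and buffers are stored on which device just before the failing call. The following is only a minimal diagnostic sketch, not a fix: `group_by_device` is a hypothetical helper name, and it assumes the model exposes `named_parameters()` and `named_buffers()` the way a standard `torch.nn.Module` does.

```python
from collections import defaultdict


def group_by_device(model):
    """Map each device string to the names of parameters and buffers stored on it.

    Hypothetical helper: works with any object that exposes named_parameters()
    and named_buffers() the way torch.nn.Module does.
    """
    placement = defaultdict(list)
    # Trainable weights first.
    for name, tensor in model.named_parameters():
        placement[str(tensor.device)].append(name)
    # Buffers are non-trainable tensors registered on the module.
    for name, tensor in model.named_buffers():
        placement[str(tensor.device)].append(name)
    return dict(placement)
```

Calling `group_by_device(model)` right before the forward pass and looking for a "cpu" key would show whether any registered tensor was left behind. It will not catch tensors created inside `forward()` or input batches that were never moved to the GPU, so those would need to be checked separately.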

XuJing1022 commented 10 months ago

Hi, have you figured out the cause?