BigDataMLexplorer opened this issue 2 weeks ago
No, this is a bug. Your page says this should work and it does not. Please do something about it.
I'm sorry, I misread! I didn't realize the code worked with other models and only failed with Falcon. Can you give us a minimal reproducer (code we can copy-paste to trigger the issue on our systems)? Also cc @muellerzr, since this seems like an accelerate thing.
@Rocketknight1 Hi, I gave the code I use in my last message; just tokenize some of your own data. The versions of the libraries I use are also in that message. Thanks
@Rocketknight1 @muellerzr Do you have any idea why this fails for the Falcon model but works for other models such as Llama 3, Nemo, Mistral, and Phi-3?
System Info
transformers version: 4.44.0

Who can help?
@ArthurZucker

Information

Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Hello, I use one Jupyter notebook to train multiple models. When I load a model, I pass device_map = "auto" to split it across multiple (4) GPUs. After that, I use the Trainer, which handles parallel training automatically. This always works except for Falcon (7B and 11B). For the other models, parallel training starts automatically and all 4 GPUs are used.
What should I do, please? I am posting part of my code and the error message:
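(The original snippet did not survive extraction; below is a minimal sketch of the setup described above, assuming the tiiuae/falcon-7b checkpoint, a placeholder wikitext dataset, and default hyperparameters. It is an illustration of the reported workflow, not the reporter's exact code.)

```python
# Minimal sketch: load a causal LM sharded across GPUs with device_map="auto",
# tokenize a small dataset, and fine-tune with Trainer.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "tiiuae/falcon-7b"  # assumed checkpoint; other models reportedly work

tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# device_map="auto" shards the model across the 4 available GPUs
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Placeholder dataset; any tokenized text data would do
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="falcon-test",
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()  # for Falcon this raises the device-mismatch error below
```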
ERROR:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward)
Expected behavior
Parallel training should run across all 4 GPUs for Falcon, just as it does for the other models.