McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

Issue when loading model on multiple gpus #69

Closed · 614TChen closed 4 months ago

614TChen commented 4 months ago

Hi, I encountered an issue when running run_mntp.py with the LLaMA 3 model loaded on multiple GPUs. The script fails during the forward pass with the error below, and I haven't found any clues to resolve it yet. Has anyone encountered the same error before, or does anyone have suggestions on how to troubleshoot this?

File "/media/users/miniconda3/envs/lmv4/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 757, in forward hidden_states = residual + hidden_states RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

vaibhavad commented 4 months ago

I was able to run Llama 3 MNTP training successfully on 2 GPUs. Here is the command I used:

torchrun --nproc_per_node=2 experiments/run_mntp.py train_configs/mntp/MetaLlama3.json

Can you share what command you are using?

614TChen commented 4 months ago

Thanks. The command you provided works like a charm on my side as well. I think the problem is that I cannot simply add device_map="auto" when loading the model.
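
For anyone who hits the same error, a minimal sketch of the two loading paths (the checkpoint name is my assumption, not necessarily the exact one from the training config):

```python
from transformers import AutoModelForCausalLM

# What I was doing (breaks under torchrun): device_map="auto" lets
# accelerate shard the layers across cuda:0 and cuda:1, so the residual
# addition inside a LLaMA block can end up adding tensors that live on
# different devices.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # assumed checkpoint name
    device_map="auto",
)

# What the command above does instead: no device_map, so each of the two
# torchrun processes loads one full replica and the Trainer moves it to
# that process's own GPU (plain data-parallel training).
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
```

As far as I can tell, device_map="auto" is meant for fitting a large model across GPUs at inference time, while torchrun/DDP expects one full copy of the model per GPU; mixing the two is what triggers the cross-device addition.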