McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License
1.17k stars 88 forks

How to use multiple GPUs #94

Closed motefly closed 3 months ago

motefly commented 3 months ago

I tried to run `CUDA_VISIBLE_DEVICES=0,1,2,3 python experiments/run_simcse.py train_configs/simcse/MetaLlama3.json` and set `device_map='auto'` in the `LLM2Vec.from_pretrained` call, but got: `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!`

Does this repo not support multi-GPU training yet?

motefly commented 3 months ago

It supports data parallelism but not model parallelism. Closing.
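Given that answer, a data-parallel launch would replicate the full model on each GPU (one process per device) rather than sharding it with `device_map='auto'`, which is what triggers the cross-device error. A minimal sketch with `torchrun`; the exact flags and whether the training script accepts this launch mode are assumptions, not confirmed by the repo:

```shell
# Sketch: data-parallel training, one process per GPU.
# Assumes the script is compatible with torchrun-style distributed launch;
# leave device_map unset so each process keeps its replica on a single GPU.
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 \
    experiments/run_simcse.py train_configs/simcse/MetaLlama3.json
```

The key difference: model parallelism (what `device_map='auto'` does) splits layers across GPUs, while data parallelism splits the batch across identical model copies, so no tensor ever needs to cross devices mid-forward.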