FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

Unable to load on multiple GPUs using HuggingFace Transformers #771

Open mohammad-yousuf opened 3 months ago

mohammad-yousuf commented 3 months ago

When I try to load the model on multiple GPUs, I get the following error:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-base')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-base', device_map='auto')
```

Error:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
```

staoxiao commented 3 months ago

Hi, @mohammad-yousuf , you need to move the data to the GPU device before passing it to the model:

```python
import torch

device = torch.device('cuda')
encoded_input = encoded_input.to(device)
```
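To illustrate the fix in isolation, here is a minimal sketch using plain PyTorch. It assumes a tokenizer-style batch (a dict of tensors, here built by hand with hypothetical token IDs) and moves every tensor onto the model's device before the forward pass, which is what resolves the `cuda:0` vs `cpu` mismatch:

```python
import torch

def to_device(batch, device):
    """Move every tensor in a tokenizer-style batch dict to the target device."""
    return {k: v.to(device) for k, v in batch.items()}

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in for tokenizer output (hypothetical token IDs, not real vocab entries).
batch = {
    'input_ids': torch.tensor([[101, 2054, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1]]),
}

batch = to_device(batch, device)
# Now model(**batch) sees inputs on the same device as its weights.
```

Note that with `device_map='auto'`, the model's first layer may land on any device; inspecting `model.device` (or the device of the embedding layer) and moving inputs there is safer than hard-coding `cuda:0`.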