openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #76

Closed by smSafiHaider 12 months ago

smSafiHaider commented 12 months ago

I used the following code to load the model:

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

device = torch.device('cuda')

model_path = 'openlm-research/open_llama_3b'
model_path = 'openlm-research/open_llama_7b'
model_path = 'openlm-research/open_llama_13b'

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto'
)

But when generating output, it gives the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

Any leads on how to solve it?

alvarobartt commented 12 months ago

Sure @smSafiHaider, to solve that you will need to use the following code instead 👍🏻

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b_v2")
model = LlamaForCausalLM.from_pretrained("openlm-research/open_llama_7b_v2", torch_dtype=torch.float16, device_map="auto")

prompt = 'Q: What is the largest animal?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
input_ids = input_ids.to(model.device)

with torch.cuda.amp.autocast():
    generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=32
    )
    print(tokenizer.decode(generation_output[0]))

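This way, the input tensors are moved to the device where the model was placed by `device_map="auto"` (from 🤗 accelerate) before calling `.generate`. The tokenizer returns its tensors on the CPU, so without the `.to(model.device)` step the embedding lookup receives CPU indices and CUDA weights, which is what raises the error. Additionally, make sure you install the following dependencies in advance: `pip install transformers einops accelerate sentencepiece`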
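If you pass more than `input_ids` to `.generate` (for example the attention mask), the same move applies to every tensor. Here is a minimal sketch of that idea. The helper name `move_inputs_to_model_device` is hypothetical and not part of transformers. It assumes `inputs` is a mapping like the one returned by `tokenizer(prompt, return_tensors="pt")`, and that `model.device` reports the device holding the model's input embeddings. That is typically the case after `device_map="auto"`, but worth checking on multi-GPU setups.

```python
def move_inputs_to_model_device(inputs, model):
    """Return a dict with every tensor in `inputs` moved to `model.device`.

    Hypothetical helper for illustration only. Values that have no `.to`
    method (plain ints, strings, ...) are passed through unchanged.
    """
    # Assumption: model.device is where the input embeddings live.
    device = model.device
    return {
        name: value.to(device) if hasattr(value, "to") else value
        for name, value in inputs.items()
    }


# Usage with the objects from the snippet above (not executed here):
# inputs = move_inputs_to_model_device(tokenizer(prompt, return_tensors="pt"), model)
# generation_output = model.generate(**inputs, max_new_tokens=32)
```

The object returned by the tokenizer also has its own `.to(device)` method, so `tokenizer(prompt, return_tensors="pt").to(model.device)` should achieve the same thing in one call.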

smSafiHaider commented 12 months ago

It worked, thank you!! 😊