I am trying to use the new model-parallelism feature (model.parallelize), but I am running into an issue with CPU/GPU tensor placement.
The following code:
from transformers import (
    AdamW,
    MT5ForConditionalGeneration,
    MT5Tokenizer,
    get_linear_schedule_with_warmup,
)

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-xl")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-xl")

device_map = {
    0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    1: [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
}
model.parallelize(device_map)
input_ids = tokenizer("Studies have been shown that owning a dog is good for you", return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=50).to('cpu')
print("Output:\n" + 100 * '-')
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Gives the following error:
Traceback (most recent call last):
  File "/net/scratch/people/plgapohl/poleval-2021/task-4/polevel-2021-task-4/model-parallel/load_model.py", line 28, in <module>
    outputs = model.generate(input_ids, max_length=50).to('cpu')
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/transformers/generation_utils.py", line 922, in generate
    model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(input_ids, model_kwargs)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/transformers/generation_utils.py", line 417, in _prepare_encoder_decoder_kwargs_for_generation
    model_kwargs["encoder_outputs"]: ModelOutput = encoder(input_ids, return_dict=True, **encoder_kwargs)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 898, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 158, in forward
    return F.embedding(
  File "/net/people/plgapohl/python_3.9-mt5/lib/python3.9/site-packages/torch/nn/functional.py", line 2043, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument index in method wrapper_index_select)
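From the traceback it looks like the embedding weights live on cuda:0 (where parallelize() put the first layers) while input_ids is still a CPU tensor. My guess, which I have not been able to confirm in the docs, is that the inputs have to be moved to the device holding the embedding before calling generate(). A minimal CPU-only sketch of that pattern, using a plain nn.Embedding as a stand-in for the model's embed_tokens:

```python
import torch

# Stand-in for the model's embed_tokens layer; in my script this is MT5's
# embedding, which parallelize() places on cuda:0.
embed = torch.nn.Embedding(num_embeddings=100, embedding_dim=8)

# torch.embedding requires the index tensor and the weight tensor to be on
# the same device, which is what the RuntimeError above complains about.
input_ids = torch.tensor([[5, 17, 42]])

# Generic fix: move the inputs to wherever the embedding weights live
# before the forward pass (cuda:0 in the parallelized setup, cpu here).
input_ids = input_ids.to(embed.weight.device)
hidden = embed(input_ids)
print(hidden.shape)  # torch.Size([1, 3, 8])
```

Applied to my script, I believe this would mean input_ids = input_ids.to("cuda:0") (the first device in the device_map) before model.generate(...) — is that the intended usage, or is generate() supposed to handle the placement itself?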