Exception in device=TPU:7: Unknown device
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py", line 484, in forward
hidden_states, layer_past=layer_past, attention_mask=attention_mask, head_mask=head_mask[i]
File "/usr/local/lib/python3.6/dist-packages/torch_xla/distributed/xla_multiprocessing.py", line 119, in _start_fn
fn(gindex, *args)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "<ipython-input-17-4d2a1ccbaa5f>", line 3, in _mp_fn
a = run(trn_df, val_df, model, tokenizer, args)
File "<ipython-input-16-d526a8f464d8>", line 81, in run
scheduler
File "<ipython-input-11-be3e41b46d25>", line 10, in train_fn
outputs = model(inputs, labels = labels)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py", line 599, in forward
inputs_embeds=inputs_embeds,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py", line 484, in forward
hidden_states, layer_past=layer_past, attention_mask=attention_mask, head_mask=head_mask[i]
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py", line 231, in forward
m = self.mlp(self.ln_2(x))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py", line 210, in forward
h = self.act(self.c_fc(x))
RuntimeError: Unknown device
🐛 Bug
Information
Model I am using (Bert, XLNet ...): DialoGPT-small from Microsoft
Language I am using the model on (English, Chinese ...): Spanish conversations
The problem arises when using: PyTorch/XLA (torch_xla) to train a GPT2 model on Google Colab TPUs
The task I am working on is:
To reproduce
Steps to reproduce the behavior:
Expected behavior
GPT2 model to start training leveraging the Google Colab TPU
Environment info
transformers version: 2.8.0
Any help is greatly appreciated, and thanks for the amazing library!