Open kolaSamuel opened 5 years ago
Hi,
I'm having the same issue as above. Is there any documentation or a tutorial on how to run this implementation of OpenNMT with BERT?
Thank you, Brian
Hello Brian,
Sadly, I haven't found any. Please let me know if you find anything helpful.
Best Wishes, Samuel
From: Ovis85 Sent: Monday, October 14, 2019 4:12 AM To: shakeel608/OpenNMT-py-with-BERT Cc: SamuelKola; Author Subject: Re: [shakeel608/OpenNMT-py-with-BERT] Error training model (#1)
Hi KolaSamuel,
I was able to fix the "RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'" error on Google Colab by setting the runtime type to GPU. I hope that helps.
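A quick way to confirm that the runtime change took effect is to ask PyTorch whether it can see a GPU before starting training. This is only a minimal sketch: `pick_device` is a hypothetical helper name, not part of OpenNMT-py or this fork, and it falls back to "cpu" when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so no GPU can be used from here.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Training would run on:", pick_device())
```

If this prints "cpu" on Colab, the notebook is probably still attached to a CPU-only runtime.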
I'm running into a new issue now:
  File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/modules/utilclass.py", line 25, in forward
    assert len(self) == len(inputs)
AssertionError
You mentioned that you got the code to work, but only in CPU mode. Did you encounter the error that I'm seeing, and if so, how did you fix it?
Thanks
Hi @Ovis85,
Have you been able to solve this issue? I am facing it now as well.
Thanks.
I'm attempting to train the model with the command below after preprocessing, as I have been doing successfully with the default OpenNMT-py:

!python -u ./train.py \
    -data preprocessed \
    -save_model CQR_valid_100 \
    -train_steps 10000 \
    -rnn_size 128 \
    -word_vec_size 128 \
    -encoder_type rnn \
    -decoder_type rnn \
    -optim adagrad \
    -learning_rate 0.15 \
    -share_embeddings \
    -valid_steps 100 \
    -save_checkpoint_steps 1000 \
    -log_file log.txt

but I keep getting this error:

Traceback (most recent call last):
  File "./train.py", line 109, in <module>
    main(opt)
  File "./train.py", line 41, in main
    single_main(opt, -1)
  File "/content/OpenNMT-py-with-BERT/onmt/train_single.py", line 116, in main
    valid_steps=opt.valid_steps)
  File "/content/OpenNMT-py-with-BERT/onmt/trainer.py", line 192, in train
    self._accum_batches(train_iter)):
  File "/content/OpenNMT-py-with-BERT/onmt/trainer.py", line 127, in _accum_batches
    for batch in iterator:
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/inputter.py", line 598, in __iter__
    for batch in self._iter_dataset(path):
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/inputter.py", line 583, in _iter_dataset
    for batch in cur_iter:
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/iterator.py", line 156, in __iter__
    yield Batch(minibatch, self.dataset, self.device)
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/batch.py", line 34, in __init__
    setattr(self, name, field.process(batch, device=device))
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/text_dataset.py", line 297, in process
    bert_embeddings = self.bertify(tensor)
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/text_dataset.py", line 235, in bertify
    bert_embeddings = bert_model(tensor, output_all_encoded_layers = False)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_pretrained_bert/modeling.py", line 730, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_pretrained_bert/modeling.py", line 267, in forward
    words_embeddings = self.word_embeddings(input_ids)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py", line 117, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1506, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'

I can fix the error by forcing the device in the text_dataset.py file to always be 'CPU', but this would make training very slow, and it already takes hours to train even on the GPU. Please help.
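The error message says that the embedding weights are on CUDA while the index tensor passed to them is on the CPU. One possible alternative to forcing everything onto the CPU is to move the input tensor to whatever device the model's parameters live on just before the call. The sketch below shows that pattern in isolation. `move_to_model_device` and `demo` are hypothetical names, and a small `torch.nn.Embedding` stands in for the BERT model; whether the same change fits inside `bertify` in text_dataset.py is an assumption that has not been checked against this repository.

```python
def move_to_model_device(model, tensor):
    """Return `tensor` on the same device as the first parameter of `model`."""
    device = next(model.parameters()).device
    return tensor.to(device)


def demo():
    """Look up token ids created on the CPU in an embedding that may be on the GPU."""
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Stand-in for the BERT model: a small embedding table on the chosen device.
    embedding = torch.nn.Embedding(10, 4).to(device)
    # Token ids start on the CPU, as they would coming out of a data iterator.
    token_ids = torch.tensor([[1, 2, 3]])
    # Without this line the lookup fails on a GPU runtime with a device mismatch.
    token_ids = move_to_model_device(embedding, token_ids)
    return embedding(token_ids).shape
```

Calling `demo()` on either a CPU or a GPU runtime should complete without the device mismatch, because the ids follow the model rather than the other way round.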