neuralmind-ai / portuguese-bert

Portuguese pre-trained BERT models

Error when training at LSTM mode #41

Open monilouise opened 2 years ago

monilouise commented 2 years ago

When I run the command:

run_bert_harem.py --bert_model C:/bert-base-portuguese-cased --labels_file data/classes-selective.txt --do_train --train_file data/FirstHAREM-selective-train.json --valid_file data/FirstHAREM-selective-dev.json --freeze_bert --pooler sum --no_crf --num_train_epochs 100 --per_gpu_train_batch_size 2 --gradient_accumulation_steps 8 --do_eval --eval_file data/MiniHAREM-selective.json --output_dir output_bert-lstm_selective

the following error occurs:

Traceback (most recent call last):
  File "(...)\portuguese-bert\ner_evaluation\run_bert_harem.py", line 129, in <module>
    main(load_and_cache_examples,
  File "(...)\portuguese-bert\ner_evaluation\trainer.py", line 737, in main
    train(
  File "(...)\portuguese-bert\ner_evaluation\trainer.py", line 242, in train
    outs = model(input_ids, segment_ids,
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "(...)\portuguese-bert\ner_evaluation\model.py", line 483, in forward
    lstm_out = self.forward_lstm(
  File "(...)\portuguese-bert\ner_evaluation\model.py", line 439, in forward_lstm
    packed_sequence, sorted_ixs = self._pack_bert_encoded_sequence(
  File "(...)\portuguese-bert\ner_evaluation\model.py", line 403, in _pack_bert_encoded_sequence
    packed_sequence = torch.nn.utils.rnn.pack_padded_sequence(
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\utils\rnn.py", line 249, in pack_padded_sequence
    _VF._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor

What could be wrong? Why can't training run in a GPU environment?

fabiocapsouza commented 2 years ago

Hi @monilouise, which version of PyTorch are you using? If you are not using 1.1.0 (the version the code was written for), note that the signature of pack_padded_sequence changed: the lengths argument must now be a CPU tensor, even when the inputs are on the GPU. Add a .to('cpu') to the lengths tensor in the forward_lstm method and it will probably work.
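For reference, a minimal standalone sketch of the failure and the fix (the tensor shapes here are made up for illustration; the repo's actual lengths tensor is built inside _pack_bert_encoded_sequence). On PyTorch ≥ 1.2, passing a CUDA lengths tensor raises exactly the RuntimeError above, and moving it to the CPU resolves it:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Simulate what forward_lstm does: a padded batch of encoder outputs,
# placed on the GPU when one is available (CPU otherwise).
device = 'cuda' if torch.cuda.is_available() else 'cpu'
batch = torch.randn(2, 5, 8, device=device)    # (batch, max_len, hidden)
lengths = torch.tensor([5, 3], device=device)  # true sequence lengths

# Since PyTorch 1.2, `lengths` must be a CPU tensor. Passing a CUDA tensor
# here triggers: "RuntimeError: 'lengths' argument should be a 1D CPU int64
# tensor, but got 1D cuda:0 Long tensor". The .to('cpu') call is the fix
# (a no-op when the tensor is already on the CPU).
packed = pack_padded_sequence(batch, lengths.to('cpu'), batch_first=True)

# The packed data has sum(lengths) rows, one per non-padded timestep.
print(packed.data.shape)
```

The same one-line change applies inside forward_lstm in model.py: move only the lengths tensor to the CPU, leaving the input sequence on the GPU.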

monilouise commented 2 years ago

Hi @fabiocapsouza , my Pytorch version is 1.11.0.

I'll try your suggestion, thanks.

fabiocapsouza commented 2 years ago

@monilouise Have you tried the suggestion? If you have, please let us know if it worked :)