facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

meet error when running LSTM #137

Closed StillKeepTry closed 6 years ago

StillKeepTry commented 6 years ago

I am trying to run the LSTM model. The command is:

```
CUDA_VISIBLE_DEVICES=0 python train.py data-bin/iwslt14.tokenized.de-en --optim adam --lr 0.0003125 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 --save-dir checkpoints/lstm/ --arch lstm_wiseman_iwslt_de_en
```

and it fails with this error:

```
| [de] dictionary: 20111 types
| [en] dictionary: 14619 types
| data-bin/iwslt14.tokenized.de-en train 160215 examples
| data-bin/iwslt14.tokenized.de-en valid 7282 examples
| model lstm_wiseman_iwslt_de_en, criterion CrossEntropyCriterion
| num. model params: 14159387
| training on 1 GPUs
| max tokens per GPU = 4000 and max sentences per GPU = None
| epoch 001:   0%| | 0/996 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 29, in <module>
    main(args)
  File "train.py", line 23, in main
    singleprocess_main(args)
  File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 79, in main
    train(args, trainer, dataset, epoch, batch_offset)
  File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 138, in train
    log_output = trainer.train_step(sample)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 94, in train_step
    loss, sample_sizes, logging_outputs, ooms_fwd = self._forward(sample)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 152, in _forward
    raise e
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 142, in _forward
    loss, sample_size, logging_output = self.criterion(self.model, sample)
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/criterions/cross_entropy.py", line 28, in forward
    net_output = model(**sample['net_input'])
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/models/fairseq_model.py", line 43, in forward
    encoder_out = self.encoder(src_tokens, src_lengths)
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/models/lstm.py", line 103, in forward
    left_to_right=True,
  File "/data/kaitao/workplaces/fairseq-py/fairseq/utils.py", line 294, in convert_padding_direction
    if pad_mask.max() == 0:
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 125, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.ByteTensor is ambiguous
```

The PyTorch version is 0.3.0, and I modified this part of the code like this. Could you give some better suggestions?

myleott commented 6 years ago

The latest version of fairseq requires PyTorch >= 0.4.0, which requires building PyTorch from source. Please follow the instructions here: https://github.com/pytorch/pytorch#from-source.

That error occurs because the semantics of `max()` changed in PyTorch at some point: it now returns a Tensor instead of a Python scalar. You can add `.item()` after the `.max()` to get the old behavior:

```
>>> x = torch.rand(4, 4)
>>> x.max()

 0.9556
[torch.FloatTensor of size ()]

>>> x.max().item()
0.9556044340133667
>>>
```
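The same fix applies to the failing check in `fairseq/utils.py` (`if pad_mask.max() == 0:` in `convert_padding_direction`). A minimal sketch with a stand-in padding mask (the tokens and padding index below are illustrative, not the actual fairseq data):

```python
import torch

# Stand-in for the padding mask that convert_padding_direction computes:
# a mask marking which token positions equal the padding index.
PAD_IDX = 1  # illustrative padding index
src_tokens = torch.tensor([[5, 6, PAD_IDX, PAD_IDX]])
pad_mask = src_tokens.eq(PAD_IDX)

# Old: `if pad_mask.max() == 0:` — raises "bool value ... is ambiguous"
# on newer PyTorch because the comparison yields a Tensor, not a bool.
# Fixed: .item() extracts a Python scalar, so the `if` is unambiguous.
if pad_mask.max().item() == 0:
    print("no padding: nothing to convert")
else:
    print("padding present: convert padding direction")
```

The same pattern (`tensor.max().item()`, or `tensor.any().item()` for boolean masks) works anywhere a reduced tensor is used in a plain Python `if`.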