RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.

Maaaj commented 2 years ago

Hello, please guide me through this issue. I am unable to run sequencer-train.sh because of the following errors.

mj@ubuntu:~/Desktop/AEC/chai-master/src $ ./sequencer-train.sh
sequencer-train.sh start
Starting data preprocessing
Please backup existing pt files: /home/mj/Desktop/AEC/chai-master/results/Golden/final.train.pt, to avoid overwriting them!
Starting training
[2021-11-17 20:34:22,616 INFO] vocabulary size. source = 1004; target = 1004
[2021-11-17 20:34:22,616 INFO] Building model...
[2021-11-17 20:34:22,645 INFO] NMTModel(
  (encoder): RNNEncoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(1004, 256, padding_idx=1)
        )
      )
    )
    (rnn): LSTM(256, 128, num_layers=2, dropout=0.3, bidirectional=True)
    (bridge): ModuleList(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): Linear(in_features=256, out_features=256, bias=True)
    )
  )
  (decoder): InputFeedRNNDecoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(1004, 256, padding_idx=1)
        )
      )
    )
    (dropout): Dropout(p=0.3, inplace=False)
    (rnn): StackedLSTM(
      (dropout): Dropout(p=0.3, inplace=False)
      (layers): ModuleList(
        (0): LSTMCell(512, 256)
        (1): LSTMCell(256, 256)
      )
    )
    (attn): GlobalAttention(
      (linear_in): Linear(in_features=256, out_features=256, bias=False)
      (linear_out): Linear(in_features=512, out_features=256, bias=False)
    )
  )
  (generator): CopyGenerator(
    (linear): Linear(in_features=256, out_features=1004, bias=True)
    (linear_copy): Linear(in_features=256, out_features=1, bias=True)
  )
)
[2021-11-17 20:34:22,645 INFO] encoder: 1179136
[2021-11-17 20:34:22,645 INFO] decoder: 2026733
[2021-11-17 20:34:22,645 INFO] number of parameters: 3205869
[2021-11-17 20:34:22,646 INFO] Starting training on CPU, could be very slow
[2021-11-17 20:34:22,646 INFO] Start training...
[2021-11-17 20:34:32,331 INFO] Loading dataset from /home/mj/Desktop/AEC/chai-master/results/Golden/final.train.0.pt, number of examples: 33469
/home/mj/.local/lib/python3.8/site-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  var = torch.tensor(arr, dtype=self.dtype, device=device)
Traceback (most recent call last):
  File "train.py", line 120, in <module>
    main(opt)
  File "train.py", line 53, in main
    single_main(opt, -1)
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/train_single.py", line 154, in main
    trainer.train(train_iter, valid_iter, opt.train_steps, opt.valid_steps)
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/trainer.py", line 172, in train
    self._gradient_accumulation(
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/trainer.py", line 280, in _gradient_accumulation
    self.model(src, tgt, src_lengths)
  File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/models/model.py", line 45, in forward
    dec_out, attns = self.decoder(tgt, memory_bank,
  File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/decoders/decoder.py", line 160, in forward
    dec_state, dec_outs, attns = self._run_forward_pass(
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/decoders/decoder.py", line 336, in _run_forward_pass
    decoder_output, p_attn = self.attn(
  File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/modules/global_attention.py", line 183, in forward
    align.masked_fill_(1 - mask, -float('inf'))
  File "/home/mj/.local/lib/python3.8/site-packages/torch/_tensor.py", line 30, in wrapped
    return f(*args, **kwargs)
  File "/home/mj/.local/lib/python3.8/site-packages/torch/_tensor.py", line 548, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
sequencer-train.sh done

Maaaj commented 2 years ago

I am hitting two problems while training. The first is a warning:

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). var = torch.tensor(arr, dtype=self.dtype, device=device)

The second is the error:

RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.

hungkien05 commented 2 years ago

Same problem over here, when I was trying to run the predict script. Edit: I got this error because I didn't use git-lfs when cloning the whole repo. I only used git-lfs to get the model.pt file.

Maaaj commented 2 years ago

> Same problem over here, when I was trying to run the predict script. Edited: I got this error since I didn't use the git-lfs when cloning the whole repo. I only used git-lfs to get the model.pt file.

Can you please share the command for cloning the whole repo with git-lfs? I used `git-lfs clone https://github.com/KTH/chai.git`, but I am still getting the same error while training.

chenzimin commented 2 years ago

Thanks all for the report, I will look into this issue.

hungkien05 commented 2 years ago

> > Same problem over here, when I was trying to run the predict script. Edited: I got this error since I didn't use the git-lfs when cloning the whole repo. I only used git-lfs to get the model.pt file.
>
> Can you please share the command for getting the clone using git-lfs for whole repo. I used git-lfs clone https://github.com/KTH/chai.git, but I am still getting the same error while training

You need to install git-lfs first. After that it works like the normal git command; just change "git" to "git-lfs". Here is my command: `git-lfs clone https://github.com/KTH/chai.git`

SamraMehboob commented 2 years ago

I am also facing the same error. If anyone has managed to solve it, please share. Thanks.

SamraMehboob commented 2 years ago

I resolved this issue. You need to edit `/chai-master/src/lib/OpenNMT-py/onmt/modules/global_attention.py`.

Change line 183 to `align.masked_fill_(~mask, -float('inf'))`. Thanks.
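For anyone who wants to see the failure and the fix in isolation, here is a minimal sketch. The tensor shapes and values are made up for illustration, and `mask` is built by hand rather than by OpenNMT-py's own code; only `masked_fill_`, `1 - mask` and `~mask` come from the thread above.

```python
import torch

# Hypothetical attention scores for a batch of 2 sequences, max length 4.
align = torch.zeros(2, 4)

# Hand-built bool mask: True marks real tokens, False marks padding.
mask = torch.tensor([[True, True, True, True],
                     [True, True, False, False]])

# The old idiom from global_attention.py raises on a bool mask.
try:
    align.clone().masked_fill_(1 - mask, -float("inf"))
    old_idiom_failed = False
except RuntimeError:
    old_idiom_failed = True

# The fix: invert the mask with `~`, so padding positions are set to -inf.
fixed = align.clone().masked_fill_(~mask, -float("inf"))

print(old_idiom_failed)
print(fixed)
```

Only the padding positions of the second row should end up as `-inf`; the real-token positions keep their original scores.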

chenzimin commented 2 years ago

This indeed seems to be a PyTorch version issue. One way to solve it is to change `1 - BOOL` to `~BOOL`; another is to downgrade torch to version 1.1.0.
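If the same code has to cope with masks of different dtypes, one option is to avoid both `-` and `~` and compare against zero instead. The sketch below is an assumption on my side, not something taken from OpenNMT-py: `invert_mask` is a hypothetical helper name, and I have only reasoned about it for recent PyTorch releases, not verified it on torch 1.1.0.

```python
import torch


def invert_mask(mask: torch.Tensor) -> torch.Tensor:
    """Return a mask that is true exactly where `mask` is zero/False.

    Hypothetical helper (not part of OpenNMT-py). Comparing with zero
    does not depend on whether `mask` is a bool or a uint8 tensor.
    """
    return mask == 0


# The same logical mask expressed with two different dtypes.
bool_mask = torch.tensor([True, False, True])
byte_mask = torch.tensor([1, 0, 1], dtype=torch.uint8)

inverted_bool = invert_mask(bool_mask)
inverted_byte = invert_mask(byte_mask)

print(inverted_bool.tolist())
print(inverted_byte.tolist())
```

Both calls should mark only the middle position, which is the position that was zero/False in the input.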

XiaoyaWang-gh commented 2 years ago

Find the file that raises the error and change "1-mask" to "~mask".

ASSERT-KTH / sequencer, issue #33