Open Maaaj opened 2 years ago
I am getting this warning while training:
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
followed by this error:
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
Same problem here when I tried to run the predict script. Edited: I got this error because I didn't use git-lfs when cloning the whole repo. I only used git-lfs to get the model.pt file.
Can you please share the command for cloning the whole repo with git-lfs? I used git-lfs clone https://github.com/KTH/chai.git, but I am still getting the same error while training.
Thanks all for the report, I will look into this issue.
You need to install git-lfs first. The command is the same as the normal git command, with "git" replaced by "git-lfs". Here is my command:
git-lfs clone https://github.com/KTH/chai.git
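To check whether a clone actually fetched the large files, one option is to look for leftover Git LFS pointer files, which are small text stubs that begin with a fixed version line. The sketch below is only an illustration: the helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical and not part of this repo.

```python
from pathlib import Path
from typing import List

# First line of every Git LFS pointer file (per the Git LFS pointer format).
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with path.open("rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


def find_lfs_pointers(root: Path) -> List[Path]:
    """List files under root that are still LFS pointers (content not fetched)."""
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts  # skip git internals
        and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    # Run from the top of the cloned repo (assumed working directory).
    for pointer in find_lfs_pointers(Path(".")):
        print("not fetched:", pointer)
```

If this prints any paths, those files are still pointers and the real content has not been downloaded.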
I am also facing the same error. If anyone has managed to solve it, please share. Thanks.
I resolved this issue. You need to edit /chai-master/src/lib/OpenNMT-py/onmt/modules/global_attention.py.
Change line 183 to align.masked_fill_(~mask, -float('inf')). Thanks.
This indeed seems to be a PyTorch version issue. One way to solve it is to change 1 - BOOL to ~BOOL; another way is to downgrade torch to version 1.1.0.
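As a rough illustration of the ~BOOL approach, here is a minimal sketch, assuming a PyTorch version where masks are bool tensors. The tensor shapes and the helper name `mask_padding` are made up for the example and are not taken from OpenNMT-py.

```python
import torch


def mask_padding(align: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Set scores at positions where mask is False to -inf, in place."""
    # ~mask inverts a bool tensor; `1 - mask` raises RuntimeError on bool tensors.
    return align.masked_fill_(~mask, -float("inf"))


# Toy attention scores: one batch entry, one query, four source positions.
align = torch.zeros(1, 1, 4)
# Assumed convention: True marks real tokens, False marks padding.
mask = torch.tensor([[[True, True, False, False]]])

masked = mask_padding(align, mask)
print(masked)
```

The padded positions end up at -inf, so a following softmax gives them zero weight.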
Find the offending file and change "1-mask" to "~mask".
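If several files contain the old idiom, the replacement could be scripted. The following is a sketch only, using a hypothetical helper `patch_source` and a regular expression that assumes the variable is literally named `mask`; review the result before overwriting any file.

```python
import re

# Matches "1 - mask" or "1-mask" as a whole expression. The lookbehind avoids
# numbers such as "11" or "0.1", and \b avoids names such as "mask_len".
PATTERN = re.compile(r"(?<![\w.])1\s*-\s*mask\b")


def patch_source(text: str) -> str:
    """Replace the `1 - mask` idiom with `~mask` in a source string."""
    return PATTERN.sub("~mask", text)


line = "align.masked_fill_(1 - mask, -float('inf'))"
print(patch_source(line))
```

To apply it to a file, read the text, pass it through `patch_source`, and write it back after checking the diff.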
Hello, please guide me through this issue. I am unable to run sequencer-train.sh because of the following errors.
mj@ubuntu:~/Desktop/AEC/chai-master/src $ ./sequencer-train.sh
sequencer-train.sh start
Starting data preprocessing
Please backup existing pt files: /home/mj/Desktop/AEC/chai-master/results/Golden/final.train.pt, to avoid overwriting them!
Starting training
[2021-11-17 20:34:22,616 INFO] vocabulary size. source = 1004; target = 1004
[2021-11-17 20:34:22,616 INFO] Building model...
[2021-11-17 20:34:22,645 INFO] NMTModel(
  (encoder): RNNEncoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(1004, 256, padding_idx=1)
        )
      )
    )
    (rnn): LSTM(256, 128, num_layers=2, dropout=0.3, bidirectional=True)
    (bridge): ModuleList(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): Linear(in_features=256, out_features=256, bias=True)
    )
  )
  (decoder): InputFeedRNNDecoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(1004, 256, padding_idx=1)
        )
      )
    )
    (dropout): Dropout(p=0.3, inplace=False)
    (rnn): StackedLSTM(
      (dropout): Dropout(p=0.3, inplace=False)
      (layers): ModuleList(
        (0): LSTMCell(512, 256)
        (1): LSTMCell(256, 256)
      )
    )
    (attn): GlobalAttention(
      (linear_in): Linear(in_features=256, out_features=256, bias=False)
      (linear_out): Linear(in_features=512, out_features=256, bias=False)
    )
  )
  (generator): CopyGenerator(
    (linear): Linear(in_features=256, out_features=1004, bias=True)
    (linear_copy): Linear(in_features=256, out_features=1, bias=True)
  )
)
[2021-11-17 20:34:22,645 INFO] encoder: 1179136
[2021-11-17 20:34:22,645 INFO] decoder: 2026733
[2021-11-17 20:34:22,645 INFO] number of parameters: 3205869
[2021-11-17 20:34:22,646 INFO] Starting training on CPU, could be very slow
[2021-11-17 20:34:22,646 INFO] Start training...
[2021-11-17 20:34:32,331 INFO] Loading dataset from /home/mj/Desktop/AEC/chai-master/results/Golden/final.train.0.pt, number of examples: 33469
/home/mj/.local/lib/python3.8/site-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
Traceback (most recent call last):
File "train.py", line 120, in <module>
main(opt)
File "train.py", line 53, in main
single_main(opt, -1)
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/train_single.py", line 154, in main
trainer.train(train_iter, valid_iter, opt.train_steps, opt.valid_steps)
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/trainer.py", line 172, in train
self._gradient_accumulation(
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/trainer.py", line 280, in _gradient_accumulation
self.model(src, tgt, src_lengths)
File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/models/model.py", line 45, in forward
dec_out, attns = self.decoder(tgt, memory_bank,
File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/decoders/decoder.py", line 160, in forward
dec_state, dec_outs, attns = self._run_forward_pass(
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/decoders/decoder.py", line 336, in _run_forward_pass
decoder_output, p_attn = self.attn(
File "/home/mj/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mj/Desktop/AEC/chai-master/src/lib/OpenNMT-py/onmt/modules/global_attention.py", line 183, in forward
align.masked_fill_(1 - mask, -float('inf'))
File "/home/mj/.local/lib/python3.8/site-packages/torch/_tensor.py", line 30, in wrapped
return f(*args, **kwargs)
File "/home/mj/.local/lib/python3.8/site-packages/torch/_tensor.py", line 548, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
sequencer-train.sh done