abcAnonymous / EDSL


Train & Test execution error #2

Open · ai-motive opened this issue 3 years ago

ai-motive commented 3 years ago

When I run the train and test commands, I get the errors below.

My Python and PyTorch versions: 3.8 / 1.7.0+cu110

Why do these errors occur?

Train command: python src/train.py --formulas data/20K/formulas.txt --train data/20K/train.txt --val data/20K/val.txt --vocab data/20K/latex_vocab.txt

Error message:

Traceback (most recent call last):
  File "src/train.py", line 461, in <module>
    out = model.forward(sourceDataTmpArray, sourcePositionTmpArray, tgt_teachingArray,
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 33, in forward
    decode_embeded = self.decode(memory, src_mask, tgt, tgt_mask)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 45, in decode
    return self.decoder(tgt_embed, memory, src_mask, tgt_mask)
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 149, in forward
    x = layer(x, memory, src_mask, tgt_mask)
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 173, in forward
    x = self.sublayer[1](x, lambda x: self.src_attn(x, m, m, src_mask))
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 102, in forward
    return x + self.dropout(sublayer(self.norm(x)))
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 173, in <lambda>
    x = self.sublayer[1](x, lambda x: self.src_attn(x, m, m, src_mask))
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 227, in forward
    [l(x).view(nbatches, -1, self.h, self.d_k).transpose(1, 2)
  File "/home/motive/PycharmProjects/EDSL/src/model/transformers.py", line 227, in <listcomp>
    [l(x).view(nbatches, -1, self.h, self.d_k).transpose(1, 2)
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/functional.py", line 1688, in linear
    if input.dim() == 2 and bias is not None:
AttributeError: 'tuple' object has no attribute 'dim'

Test command: python src/test.py --formulas data/20K/formulas.txt --test data/20K/test.txt --vocab data/20K/latex_vocab.txt

Error message:

Traceback (most recent call last):
  File "src/test.py", line 321, in <module>
    model.load_state_dict(torch.load(root_path + '/data/model/model.pkl'))
  File "/home/motive/anaconda3/envs/EDSL/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for EncoderDecoder:
    size mismatch for tgt_embed.lut.weight: copying a param with shape torch.Size([301, 256]) from checkpoint, the shape in current model is torch.Size([195, 256]).
    size mismatch for generator.proj.weight: copying a param with shape torch.Size([301, 256]) from checkpoint, the shape in current model is torch.Size([195, 256]).
    size mismatch for generator.proj.bias: copying a param with shape torch.Size([301]) from checkpoint, the shape in current model is torch.Size([195]).
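
For reference, the shape mismatch (301 vs. 195 rows) suggests the released checkpoint was trained with a larger vocabulary than the one built from data/20K/latex_vocab.txt. A minimal diagnostic sketch to confirm this before calling load_state_dict — it assumes model.pkl is a plain state_dict, as the load_state_dict call above implies, and uses the key names from the error message:

    import torch

    # Vocabulary size baked into the checkpoint: rows of the target embedding.
    state = torch.load('data/model/model.pkl', map_location='cpu')
    print(state['tgt_embed.lut.weight'].shape[0])  # 301 for the released model

    # Number of symbols in the local vocab file; the model may add a few
    # special tokens (padding, start, end) on top of this count.
    with open('data/20K/latex_vocab.txt') as f:
        print(sum(1 for line in f if line.strip()))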
sadik1111 commented 3 years ago

After many modifications, I ran into the same problem. It seems to be caused by the provided model.

ai-motive commented 3 years ago

@sadik1111 Have you succeeded in running the train and test code? Can you share your EDSL repo?

sadik1111 commented 3 years ago

> @sadik1111 Have you succeeded in running the train and test code? Can you share your EDSL repo?

I haven't managed to run it either.

LiuyangRiver commented 2 years ago

Is it working now? Thanks. I couldn't get it to run either.

Error message: IndexError: index 1652 is out of bounds for axis 0 with size 0

abcAnonymous commented 2 years ago

> When I run the train and test commands, I get the errors below. […]
> AttributeError: 'tuple' object has no attribute 'dim'

After checking, the first error (the AttributeError during training) comes from line 86 of src/model/transformers.py:

85        # return self.norm(x)
86        return self.norm(x), attn  

The commented-out code at line 85 is the correct version, not line 86: returning (self.norm(x), attn) makes forward hand back a tuple where a tensor is expected, and that tuple eventually reaches F.linear inside the attention block, raising the AttributeError above. I have fixed the erroneous code.
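
For anyone else hitting this, a sketch of the fix. The class name, constructor, and forward signature here are assumed from the standard annotated-transformer layout that the traceback suggests; only the return statement is the actual change:

    import copy
    import torch.nn as nn

    class Encoder(nn.Module):
        """Sketch of the module around transformers.py lines 85-86."""
        def __init__(self, layer, n):
            super().__init__()
            self.layers = nn.ModuleList(copy.deepcopy(layer) for _ in range(n))
            self.norm = nn.LayerNorm(layer.size)

        def forward(self, x, mask):
            for layer in self.layers:
                x = layer(x, mask)
            return self.norm(x)          # line 85: correct, a single tensor
            # return self.norm(x), attn  # line 86: wrong; the tuple propagates
            #                            # into F.linear and raises the error

After this change, retraining produces a checkpoint whose shapes match the current model, which should also avoid the load_state_dict size mismatch, provided the same latex_vocab.txt is used for both training and testing.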