SamLynnEvans / Transformer

Transformer seq2seq model, program that can build a language translator from parallel corpus
Apache License 2.0

runtime error #8

Open xiaohongniua opened 5 years ago

xiaohongniua commented 5 years ago

The whole error is as follows. `x = x + pe` got tensors of different sizes?

```
creating dataset and iterator...
model weights will be saved every 20 minutes and at end of epoch to directory weights/
training model...
Traceback (most recent call last):
  File "train.py", line 192, in <module>
    main()
  File "train.py", line 120, in main
    train_model(model, opt)
  File "train.py", line 37, in train_model
    preds = model(src, trg_input, src_mask, trg_mask)
  File "/home/tensorflow/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tensorflow/reaction/Transformer/Models.py", line 50, in forward
    d_output = self.decoder(trg, e_outputs, src_mask, trg_mask)
  File "/home/tensorflow/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tensorflow/reaction/Transformer/Models.py", line 36, in forward
    x = self.pe(x)
  File "/home/tensorflow/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tensorflow/reaction/Transformer/Embed.py", line 40, in forward
    x = x + pe
RuntimeError: The size of tensor a (230) must match the size of tensor b (200) at non-singleton dimension 1
```
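The error itself is just PyTorch refusing to add two tensors whose sequence dimensions differ; since neither dimension is 1, they cannot broadcast. A minimal reproduction (shapes chosen to match the traceback above, not the repo's code):

```python
import torch

# Shapes mirror the numbers in the traceback: the stored positional
# table covers 200 positions but the batch has 230-token sequences.
pe = torch.zeros(1, 200, 512)   # (1, max_seq_len, d_model) positional table
x = torch.zeros(2, 230, 512)    # (batch, seq_len, d_model) input batch

try:
    x + pe                      # dim 1: 230 vs 200 -> cannot broadcast
except RuntimeError as err:
    print(err)
```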

CharizardAcademy commented 5 years ago

Same problem here; I can't figure out why it's happening. For me it occurs during inference: as soon as the decoding time step exceeds 512, the error appears.
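One inference-side workaround is to cap the decode loop at the positional table's capacity, so the decoder never asks for a position the table cannot index. A tiny sketch (the names `requested_max_len` and `pe_table_len` are illustrative, not from the repo):

```python
def clamp_decode_horizon(requested_max_len: int, pe_table_len: int) -> int:
    """Upper-bound the number of decode steps by the PE table size."""
    return min(requested_max_len, pe_table_len)

# e.g. a 512-row table caps a 1000-step decode request at 512
print(clamp_decode_horizon(1000, 512))  # -> 512
```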

CharizardAcademy commented 5 years ago

I'm testing the model on a customized dataset, and during inference this weird issue keeps showing up:

```
translating the 0th sentence done.
translating the 1th sentence done.
...
translating the 42th sentence done.
```
```
Traceback (most recent call last):
  File "translate.py", line 95, in <module>
    main()
  File "translate.py", line 85, in main
    translated_ref_sentence = translate(opt, sentence, model, SRC, REF)
  File "translate.py", line 39, in translate
    translated.append(translate_sentence(sentence, model, opt, SRC, REF).capitalize())
  File "translate.py", line 29, in translate_sentence
    sentence = beam_search(sentence, model, SRC, REF, opt)
  File "/home/yingqiang/workspace/Archi-max-pool/BeamSearch.py", line 66, in beam_search
    out = model.to_vocab(model.decoder(ref=outputs[:,:i], enc_outputs=e_output, ref_mask=ref_mask))
  File "/home/yingqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yingqiang/workspace/Archi-max-pool/Model.py", line 87, in forward
    x = self.pe(x)
  File "/home/yingqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yingqiang/workspace/Archi-max-pool/Embedding.py", line 45, in forward
    x = x + pe
RuntimeError: The size of tensor a (513) must match the size of tensor b (512) at non-singleton dimension 1
```

Any ideas on how to solve this problem?

xiaohongniua commented 5 years ago

It might be a problem with the parameters here. Setting `max_seq_len` to a larger value works for me!

```python
class PositionalEncoder(nn.Module):
    def __init__(self, d_model, max_seq_len = 200, dropout = 0.1):
        super().__init__()
        self.d_model = d_model
        self.dropout = nn.Dropout(dropout)

        # create constant 'pe' matrix with values dependent on
        # pos and i
        pe = torch.zeros(max_seq_len, d_model)
        for pos in range(max_seq_len):
            for i in range(0, d_model, 2):
                pe[pos, i] = \
                math.sin(pos / (10000 ** ((2 * i)/d_model)))
                pe[pos, i + 1] = \
                math.cos(pos / (10000 ** ((2 * (i + 1))/d_model)))
        pe = pe.unsqueeze(0)
```
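For reference, here is a sketch of that fix end-to-end: the same sin/cos table built with a larger `max_seq_len`, plus an explicit length check so a too-long batch fails with a readable message instead of a broadcast error. The vectorised table, the `ValueError` guard, and the `forward` shown here are my additions, not the repo's exact code:

```python
import math
import torch
import torch.nn as nn

class PositionalEncoder(nn.Module):
    def __init__(self, d_model, max_seq_len=512, dropout=0.1):
        super().__init__()
        self.d_model = d_model
        self.dropout = nn.Dropout(dropout)
        # vectorised version of the same sin/cos table as the loop above
        pos = torch.arange(max_seq_len).unsqueeze(1).float()   # (max_seq_len, 1)
        i = torch.arange(0, d_model, 2).float()                # even column indices
        pe = torch.zeros(max_seq_len, d_model)
        pe[:, 0::2] = torch.sin(pos / (10000 ** (2 * i / d_model)))
        pe[:, 1::2] = torch.cos(pos / (10000 ** (2 * (i + 1) / d_model)))
        self.register_buffer('pe', pe.unsqueeze(0))            # (1, max_seq_len, d_model)

    def forward(self, x):
        x = x * math.sqrt(self.d_model)
        seq_len = x.size(1)
        if seq_len > self.pe.size(1):
            raise ValueError(
                f"sequence length {seq_len} exceeds max_seq_len {self.pe.size(1)}")
        # slice the table to the current sequence length before adding
        x = x + self.pe[:, :seq_len]
        return self.dropout(x)
```

With `max_seq_len=512`, a 230-token batch that previously crashed passes through; if a batch still exceeds the table, the guard reports it directly instead of surfacing as a cryptic broadcast error.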