# Padding Should be Zero
```python
# The padding symbol 'P' must map to index 0 so it can be treated specially later
src_vocab = {'P': 0, 'ich': 1, 'mochte': 2, 'ein': 3, 'bier': 4}
src_vocab_size = len(src_vocab)

tgt_vocab = {'P': 0, 'i': 1, 'want': 2, 'a': 3, 'beer': 4, 'S': 5, 'E': 6}
number_dict = {i: w for i, w in enumerate(tgt_vocab)}  # index -> word
tgt_vocab_size = len(tgt_vocab)
```
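Mapping padding to index 0 is what lets the rest of the code handle it specially. A minimal sketch of the idea, where the embedding size 8 and the variable names are only illustrative, not the tutorial's exact settings:

```python
import torch
import torch.nn as nn

src_vocab_size = 5  # len(src_vocab) from the snippet above

# padding_idx=0 keeps the 'P' row of the embedding as a zero vector
src_emb = nn.Embedding(src_vocab_size, 8, padding_idx=0)

enc_inputs = torch.LongTensor([[1, 2, 3, 4, 0]])  # "ich mochte ein bier P"
print(src_emb(enc_inputs)[0, -1])  # the padded position embeds to all zeros
```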
1. Mistake in Transformer

I changed the code to make it clearer. There was a mistake in the Transformer's Position Encoding: because the positions were fed as `torch.LongTensor([[1,2,3,4,5]])`, the embedding indexing got mixed up. So I corrected it together with the shape of `get_sinusoid_encoding_table`; a sketch of such a table follows. In the Encoder, `self.pos_emb(torch.LongTensor([[5,1,2,3,4]]))` is correct for `ich mochte ein bier P`, and in the Decoder, `self.pos_emb(torch.LongTensor([[5,1,2,3,4]]))` is correct for `S i want a beer`.
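For reference, here is a sketch of a sinusoid table built with the standard formula from "Attention Is All You Need", with one extra row so a spare index is available for the padding position. The exact shape and `d_model` used in this repo are assumptions here:

```python
import numpy as np
import torch

def get_sinusoid_encoding_table(n_position, d_model):
    # angle(pos, i) = pos / 10000^(2*(i//2)/d_model)
    def cal_angle(position, hid_idx):
        return position / np.power(10000, 2 * (hid_idx // 2) / d_model)

    table = np.array([[cal_angle(pos, i) for i in range(d_model)]
                      for pos in range(n_position)])
    table[:, 0::2] = np.sin(table[:, 0::2])  # even dimensions: sine
    table[:, 1::2] = np.cos(table[:, 1::2])  # odd dimensions: cosine
    return torch.FloatTensor(table)          # shape: [n_position, d_model]

# src_len + 1 rows, so one index is left over for the padding position
pos_emb = torch.nn.Embedding.from_pretrained(
    get_sinusoid_encoding_table(5 + 1, 512), freeze=True)
```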
2. BERT is too heavy as a tutorial

In the original paper, `maxlen` is 512 and `n_layers` (the number of layers) is 12, but that is too heavy to run as a tutorial, so I reduced these settings, roughly as sketched below.
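A hedged sketch of what the scaled-down hyperparameters look like; the exact values below are illustrative assumptions, not necessarily the ones committed in this repo:

```python
# BERT-base in the paper: maxlen = 512, n_layers = 12
# Tutorial-sized settings so the model trains quickly on a CPU:
maxlen   = 30            # maximum sequence length
n_layers = 6             # number of encoder layers
n_heads  = 12            # attention heads per layer
d_model  = 768           # embedding / hidden size
d_ff     = 4 * d_model   # feed-forward inner size
max_pred = 5             # max masked tokens predicted per sequence
```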
Also, as in other BERT implementation repositories, when preprocessing for masking, the `[CLS]`, `[SEP]`, and `[PAD]` tokens should never be replaced with `[MASK]`.
The code at https://github.com/dhlee347/pytorchic-bert/blob/master/pretrain.py#L132 handles this correctly, so I fixed my code accordingly; a sketch of the idea follows.
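A sketch of that preprocessing rule, assuming a `word_dict` that maps `[PAD]`, `[CLS]`, `[SEP]`, and `[MASK]` to ids 0 to 3. The helper name and the 80/10/10 split follow the BERT paper; the exact code in this repo may differ:

```python
import random

def make_masked_lm_data(input_ids, word_dict, vocab_size, n_pred):
    # Only real word positions are candidates; never mask [CLS], [SEP] or [PAD]
    special = {word_dict['[CLS]'], word_dict['[SEP]'], word_dict['[PAD]']}
    cand_pos = [i for i, tok in enumerate(input_ids) if tok not in special]
    random.shuffle(cand_pos)

    masked_tokens, masked_pos = [], []
    for pos in cand_pos[:n_pred]:
        masked_pos.append(pos)
        masked_tokens.append(input_ids[pos])
        r = random.random()
        if r < 0.8:                                    # 80%: replace with [MASK]
            input_ids[pos] = word_dict['[MASK]']
        elif r < 0.9:                                  # 10%: random word (ids 0-3 assumed special)
            input_ids[pos] = random.randint(4, vocab_size - 1)
        # remaining 10%: keep the original token
    return input_ids, masked_tokens, masked_pos
```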
Then I added a SEGMENT MASK that masks out positions where the token is zero padding (see the sketch below). This is a very important point.
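And a sketch of masking out zero-padding positions in attention. The function name `get_attn_pad_mask` mirrors a common pattern in Transformer tutorials; treat it as an assumption rather than this repo's exact code:

```python
import torch

def get_attn_pad_mask(seq_q, seq_k):
    """True wherever the key token is zero padding; broadcast over the query length."""
    batch_size, len_q = seq_q.size()
    _, len_k = seq_k.size()
    pad_mask = seq_k.eq(0).unsqueeze(1)               # [batch, 1, len_k]
    return pad_mask.expand(batch_size, len_q, len_k)  # [batch, len_q, len_k]

# Example: the second sequence ends with two padding tokens
ids = torch.LongTensor([[1, 2, 3, 4, 5], [1, 2, 3, 0, 0]])
mask = get_attn_pad_mask(ids, ids)  # these positions get -inf before the softmax
```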