Open MLRadfys opened 1 year ago
Hi again,
I just downloaded your repo and tried to run the code from there, and everything works as it should. The strange thing is that I get an index error in the embedding layer.
I must have missed something...
But do you know why the embedding layer has an input dimension equal to the vocabulary size and not the max sequence length?
Regards,
M
Hi and thanks for the great series about transformers!
I noticed that you initialize the nn.Embedding layer for the word embeddings with an input size equal to the vocabulary size. Since we want to add the positional encodings, which have dimensions max_seq_length x 512, on top of the embeddings, the word embeddings should have the same dimensions as the positional embeddings (max_seq_length x 512). So the corrected code would look something like:
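(A minimal sketch of the change I mean; the variable names and values below are just placeholders and may not match the exact names used in the repo:)

```python
import torch.nn as nn

# placeholder values, just so the snippet is self-contained
vocab_size = 10000
max_seq_length = 100
d_model = 512

# current initialization (as I understand the repo):
# word_embedding = nn.Embedding(vocab_size, d_model)

# what I was thinking of instead, so the word embeddings have the same
# dimensions as the positional encodings (max_seq_length x 512):
word_embedding = nn.Embedding(max_seq_length, d_model)
```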
Regards,
M