ajhalthor / Transformer-Neural-Network

Code Transformer neural network components piece by piece
MIT License

Word embeddings input size #6

Open MLRadfys opened 1 year ago

MLRadfys commented 1 year ago

Hi and thanks for the great series about transformers!

I noticed that you initialize the nn.Embedding layer for the word embeddings with an input size equal to the vocabulary size. Since we want to add the positional encodings, which have dimensions max_seq_length x 512, on top of the embeddings, the word embeddings should have the same dimensions as the positional embeddings (max_seq_length x 512).

So the corrected code would look something like:

class SentenceEmbedding(nn.Module):
    "For a given sentence, create an embedding"
    def __init__(self, max_sequence_length, d_model, language_to_index, START_TOKEN, END_TOKEN, PADDING_TOKEN):
        super().__init__()
        self.vocab_size = len(language_to_index)
        self.max_sequence_length = max_sequence_length
        self.embedding = nn.Embedding(self.max_sequence_length, d_model)  # corrected line
        ...
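
Just to make the shape requirement concrete, here is a tiny sketch of the addition I have in mind (all sizes are illustrative, not taken from the repo):

import torch

# Illustrative sizes only
max_sequence_length, d_model = 200, 512

# The word embeddings for a sentence and the positional encodings must
# share the trailing dimensions so they can be added elementwise.
word_embeddings = torch.randn(max_sequence_length, d_model)       # (200, 512)
positional_encodings = torch.zeros(max_sequence_length, d_model)  # (200, 512)

x = word_embeddings + positional_encodings
print(x.shape)  # torch.Size([200, 512])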

Regards,

M

MLRadfys commented 1 year ago

Hi again,

I just downloaded your repo and tried to run the code from there, and everything works as it should. The strange thing is that I was getting an index error in the embedding layer.

I must have missed something...

But do you know why the embedding layer's input dimension is the vocabulary size and not the max sequence length?
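
For what it's worth, here is how I currently understand the nn.Embedding lookup, which is probably where my index error came from (all numbers are made up):

import torch
import torch.nn as nn

# Illustrative numbers, not taken from the repo
vocab_size = 1000          # number of distinct tokens in the language
max_sequence_length = 200  # number of positions in a padded sentence
d_model = 512

# nn.Embedding is a lookup table with one row per token id, so its first
# argument has to cover the full range of token indices.
embedding = nn.Embedding(vocab_size, d_model)

token_ids = torch.randint(0, vocab_size, (max_sequence_length,))
out = embedding(token_ids)
print(out.shape)  # torch.Size([200, 512])

# If the table only had max_sequence_length rows, any token id >= 200
# would be out of range and raise an IndexError.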

Regards,

M