codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

PositionalEmbedding #53

Open fgaoyang opened 5 years ago

fgaoyang commented 5 years ago

The positional embedding in BERT is not the same as in the original Transformer. Why not use the form used in BERT?

codertimo commented 5 years ago

@Yang92to Great point. I'll check out the BERT positional embedding method and update ASAP.

jacklanchantin commented 4 years ago

@codertimo The BERT positional embedding method is simply to learn an embedding for each position. You can use `nn.Embedding` with a constant input sequence `[0, 1, 2, ..., L-1]`, where `L` is the maximum sequence length.
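
For reference, a minimal sketch of that idea (the `max_len`, `d_model`, and vocabulary size here are just illustrative, not taken from this repo):

```python
import torch
import torch.nn as nn

# Learned positional embeddings, BERT-style:
# one trainable vector per position index.
max_len, d_model = 512, 768
position_embedding = nn.Embedding(max_len, d_model)

# x: token id tensor of shape (batch_size, seq_len)
x = torch.randint(0, 30000, (2, 128))

positions = torch.arange(x.size(1), device=x.device)       # [0, 1, ..., seq_len-1]
positions = positions.unsqueeze(0).expand(x.size(0), -1)   # (batch_size, seq_len)

pos_emb = position_embedding(positions)                    # (batch_size, seq_len, d_model)
```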

yonghee12 commented 3 years ago

@codertimo Since BERT uses learned positional embeddings, and this is one of the biggest differences between the original Transformer and BERT, I think it is quite urgent to modify the positional embedding part. A sketch of what a replacement module could look like is below.
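
This is only a sketch, and the forward signature (taking the token sequence and returning one embedding per position, to be added to the token embeddings) is an assumption about how this repo's `PositionalEmbedding` is used, not a confirmed drop-in replacement:

```python
import torch
import torch.nn as nn

class LearnedPositionalEmbedding(nn.Module):
    """BERT-style learned positional embedding (hypothetical replacement
    for the sinusoidal PositionalEmbedding in this repo)."""

    def __init__(self, d_model, max_len=512):
        super().__init__()
        self.embedding = nn.Embedding(max_len, d_model)

    def forward(self, x):
        # x: (batch_size, seq_len) token ids
        positions = torch.arange(x.size(1), device=x.device)
        # (1, seq_len, d_model), broadcastable when added to token embeddings
        return self.embedding(positions).unsqueeze(0)
```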