openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

Position embedding matrix Wp was not used in the code? #25

Closed thanhnguyentang closed 5 years ago

thanhnguyentang commented 5 years ago

Hey, it seems from the code that the position embedding matrix W_p was not used. Am I correct?

h_0 = U W_e + W_p
h_l = transformer_block(h_{l-1})  ∀ l ∈ [1, n]
P(u) = softmax(h_n W_e^T)

Thank you.
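For reference, the three equations can be read as the following toy NumPy sketch (made-up small dimensions and an identity stand-in for the transformer block; this is not the repo's code):

```python
# Toy sketch of the paper's equations: h_0 = U W_e + W_p,
# h_l = transformer_block(h_{l-1}), P(u) = softmax(h_n W_e^T).
import numpy as np

n_vocab, n_ctx, n_embd, n_layer = 1000, 32, 16, 2   # toy dimensions
rng = np.random.default_rng(0)
W_e = rng.normal(scale=0.02, size=(n_vocab, n_embd))  # token embedding matrix
W_p = rng.normal(scale=0.01, size=(n_ctx, n_embd))    # position embedding matrix

def transformer_block(h):
    return h  # placeholder for masked self-attention + MLP

tokens = rng.integers(0, n_vocab, size=10)
# U W_e is just a row lookup; adding W_p gives h_0 = U W_e + W_p.
h = W_e[tokens] + W_p[:len(tokens)]
for _ in range(n_layer):
    h = transformer_block(h)              # h_l = transformer_block(h_{l-1})
logits = h @ W_e.T                        # output projection tied to W_e
P = np.exp(logits - logits.max(-1, keepdims=True))
P /= P.sum(-1, keepdims=True)             # P(u) = softmax(h_n W_e^T)
```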

thanhnguyentang commented 5 years ago

I found that the position embedding is implicitly applied in transform_roc, so the concern is fully addressed and I'm closing this issue.
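For future readers, here is a minimal NumPy sketch of the trick (toy dimensions and illustrative names `we`, `encode`, `embed`; the real logic lives in transform_roc and the model's embedding op): each token id is paired with a position id offset past the vocabulary, both ids are looked up in one shared matrix, and the two rows are summed, which equals U W_e + W_p.

```python
# Minimal sketch (not the repo's exact code) of folding W_p into the
# token-embedding lookup via a single shared embedding table.
import numpy as np

n_vocab, n_special, n_ctx, n_embd = 1000, 3, 32, 16  # toy dimensions
rng = np.random.default_rng(0)

# One table: token rows first, then special-token rows, then position rows.
we = rng.normal(scale=0.02, size=(n_vocab + n_special + n_ctx, n_embd))

def encode(token_ids):
    """Pair each token id with its position id (offset past the vocabulary)."""
    T = len(token_ids)
    x = np.zeros((T, 2), dtype=np.int64)
    x[:, 0] = token_ids
    x[:, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + T)
    return x

def embed(x, we):
    """Gather both ids per step and sum: the trailing n_ctx rows of `we`
    play the role of W_p, so the result is U W_e + W_p."""
    return we[x].sum(axis=1)                      # shape (T, n_embd)

tokens = rng.integers(0, n_vocab, size=10)
h0 = embed(encode(tokens), we)

# Explicit form for comparison.
W_e, W_p = we[:n_vocab + n_special], we[n_vocab + n_special:]
assert np.allclose(h0, W_e[tokens] + W_p[:len(tokens)])
```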