ylwangy opened this issue 3 years ago (status: Open)
The paper says you add the absolute position embeddings after all the Transformer layers, before the softmax layer for MLM. However, I could not find these parameters.
Looking forward to your response. Thank you.