richardbaihe / paperreading

NLP papers
MIT License
2 stars 0 forks source link

Arxiv 2020 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention #37

Closed richardbaihe closed 4 years ago

richardbaihe commented 4 years ago
image

1. Position embedding

image

2. replace output layer with two parameters-shared output layer