Arxiv 2020 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention

richardbaihe / paperreading

NLP papers

MIT License


Arxiv 2020 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention #37

Closed: richardbaihe closed this issue 4 years ago

richardbaihe commented 4 years ago

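1. Position embedding: disentangled attention represents each token with separate content and relative-position vectors, and sums content-to-content, content-to-position, and position-to-content attention scores.

2. Replace the output layer with two output layers that share parameters.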
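
A rough sketch of both notes in plain Python, not the authors' implementation. It assumes the content and relative-position vectors are already projected, and follows the paper's description of the score as three summed terms scaled by 1/sqrt(3d). The names `relative_index`, `disentangled_scores`, and `apply_shared_layers` are hypothetical helpers introduced only for illustration.

```python
import math


def relative_index(i, j, k):
    """Bucket the signed distance i - j into the range [0, 2k)."""
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def disentangled_scores(q_c, k_c, q_r, k_r, k):
    """Unnormalized attention scores for one head.

    q_c, k_c: content query/key vectors, one per token (n x d).
    q_r, k_r: relative-position query/key vectors (2k x d).
    All four are assumed to be already projected.
    """
    n, d = len(q_c), len(q_c[0])
    scale = 1.0 / math.sqrt(3 * d)
    scores = []
    for i in range(n):
        row = []
        for j in range(n):
            c2c = dot(q_c[i], k_c[j])                         # content-to-content
            c2p = dot(q_c[i], k_r[relative_index(i, j, k)])   # content-to-position
            p2c = dot(k_c[j], q_r[relative_index(j, i, k)])   # position-to-content
            row.append((c2c + c2p + p2c) * scale)
        scores.append(row)
    return scores


def apply_shared_layers(layer, x, n=2):
    """Note 2, read as: call the same layer n times, so the
    stacked output layers share one set of parameters."""
    for _ in range(n):
        x = layer(x)
    return x
```

With all relative-position vectors set to zero, `disentangled_scores` reduces to ordinary scaled dot-product scores (with the 1/sqrt(3d) scale), which is a quick way to sanity-check the extra two terms.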