microsoft / DeBERTa

The implementation of DeBERTa
MIT License
1.91k stars 215 forks

Which token represents the whole sentence in the feature-extraction task: the first token or the last token? #112

Open junzai0215 opened 1 year ago

junzai0215 commented 1 year ago

Thank you very much.