microsoft / DeBERTa

The implementation of DeBERTa
MIT License

How to pretrain DeBERTa v3? #108

Closed: BinhMinhs10 closed this issue 1 year ago

BinhMinhs10 commented 1 year ago

I am planning to pretrain DeBERTa v3 with RTD (replaced token detection) and gradient-disentangled embedding sharing, but I don't have any proper references or resources on how to start pretraining it.

Opdoop commented 1 year ago

I found the documentation, but sadly the pre-training-with-replaced-token-detection-task section is stuck in a "Coming soon..." state.

BigBird01 commented 1 year ago

Updated.