youssefabdelm opened 3 years ago
Hi @youssefabdelm. There is no direct plan to train and provide models for DeBERTa.
The improvement on STS-b is negligible compared to other models (RoBERTa large has 92.4 Pearson), and in any practical setting you will probably not see a difference.
But since DeBERTa is available in HuggingFace Transformers, training your own models based on it should be rather simple.
@nreimers Ah okay, thanks! I did not know RoBERTa large had 92.4 Pearson. I remember switching from Google's Universal Sentence Encoder to RoBERTa large having a noticeable effect (numerically, from 75 to 86 accuracy, if I recall correctly). That, combined with the fact that this is my favorite package by far, made me curious whether there'd be a difference, but it's good to know RoBERTa large is on par with it.
Yeah, I might fine-tune DeBERTa!
It seems Microsoft has been able to push the performance on STS-b to 92.5: https://github.com/microsoft/DeBERTa
Their models are available via huggingface, but I'm not sure if their STS fine-tuned model is.
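Loading one of their checkpoints for your own fine-tuning is straightforward with the standard Transformers API. The checkpoint name `microsoft/deberta-base` is one of the published ones; whether a ready-made STS fine-tuned variant exists on the hub would need to be checked separately.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load a published DeBERTa checkpoint from the HuggingFace hub.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

# Quick smoke test: encode a sentence and inspect the hidden states.
inputs = tokenizer("A plane is taking off.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
hidden = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
```

From here you could plug the model into the sentence-transformers training loop or fine-tune it directly on STS-b yourself.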