Open duyvuleo opened 1 year ago
Hi @duyvuleo
Currently, converting DeBERTa to Long DeBERTa is not possible because this model uses a specific attention mechanism called "disentangled attention", which relies on different inputs plus relative positional embeddings.
To make DeBERTa compatible, some things need to be rethought specifically for this model. I may add DeBERTa in the future.
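For context, here is a rough NumPy sketch of how disentangled attention scores are described in the DeBERTa paper: a content-to-content term plus two terms that look up relative position embeddings. This is not the code from this repo or from `transformers`; it is a single-head illustration with no masking, and the names (`disentangled_scores`, `relative_bucket`, the weight arguments) are made up for the example.

```python
import numpy as np

def relative_bucket(n, k):
    """delta[i, j]: relative distance i - j, shifted by k and clipped to [0, 2k - 1]."""
    idx = np.arange(n)
    diff = idx[:, None] - idx[None, :]
    return np.clip(diff + k, 0, 2 * k - 1)

def disentangled_scores(H, P, Wq_c, Wk_c, Wq_r, Wk_r, k):
    """Unnormalized attention scores for one head (illustrative only).

    H: (n, h) content hidden states
    P: (2k, h) relative position embeddings
    W*: (h, d) projection matrices (hypothetical names)
    """
    n, d = H.shape[0], Wq_c.shape[1]
    Qc, Kc = H @ Wq_c, H @ Wk_c      # content queries / keys, (n, d)
    Qr, Kr = P @ Wq_r, P @ Wk_r      # position queries / keys, (2k, d)
    delta = relative_bucket(n, k)    # (n, n)

    # content-to-content: Qc_i . Kc_j
    c2c = Qc @ Kc.T
    # content-to-position: Qc_i . Kr_{delta(i, j)}
    c2p = np.take_along_axis(Qc @ Kr.T, delta, axis=1)
    # position-to-content: Kc_j . Qr_{delta(j, i)}, gathered per key then transposed
    p2c = np.take_along_axis(Kc @ Qr.T, delta, axis=1).T

    return (c2c + c2p + p2c) / np.sqrt(3 * d)
```

The point of the sketch is only to show the extra input (`P`) and the relative-position lookups that a standard self-attention layer does not have, which is what any long-sequence variant would have to handle.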
Hi,
Thanks for the great work.
Is it possible to convert DeBERTa models to Long DeBERTa ones? Could you advise on the specific steps I should follow?
Looking forward to your response. Thanks!