Closed robinsongh381 closed 2 years ago
I am also waiting for the v3 pretraining code and would absolutely LOVE to see it integrated into Hugging Face Flax!
It's for v1 & v2. We are still working on the v3 code and will release it once it has passed internal processing.
I do not get it. Why work on these models when they just give random predictions? You would not need a model at all; it would be much faster to just replace [MASK] with random(vocabulary), and you would get the same results with the same extremely bad accuracy.
You could at least build a unigram model of the data, and it would be much more accurate than this random model.
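For what it's worth, the unigram baseline mentioned above can be sketched in a few lines. This is a minimal illustration only; the toy corpus, the whitespace tokenization, and the function name `unigram_fill_mask` are made up for the example:

```python
from collections import Counter

# Toy corpus standing in for real training data (hypothetical example).
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "paris is a city in france ."
).split()

# A unigram "model" ignores context entirely: it just counts tokens
# and predicts the globally most frequent ones for every [MASK].
unigram_counts = Counter(corpus)

def unigram_fill_mask(top_k=3):
    """Return the top_k most frequent tokens, regardless of context."""
    return [tok for tok, _ in unigram_counts.most_common(top_k)]

print(unigram_fill_mask())
```

Even this context-free predictor assigns sensible mass to common tokens, which is the sense in which it would beat uniformly random sampling from the vocabulary.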
from transformers import pipeline

# Note: the checkpoint on the Hugging Face Hub is namespaced as 'microsoft/deberta-base'
unmasker = pipeline('fill-mask', model='microsoft/deberta-base')
the_out = unmasker("The capital of France is [MASK].")
print("the_out", the_out)
As you can see, the DeBERTa results are completely wrong; there must be some big error in the port to transformers.
the_out [{'score': 0.001861382625065744, 'token': 18929, 'token_str': 'ABC', 'sequence': 'The capital of France isABC.'}, {'score': 0.0012871784856542945, 'token': 15804, 'token_str': ' plunge', 'sequence': 'The capital of France is plunge.'}, {'score': 0.001228992477990687, 'token': 47366, 'token_str': 'amaru', 'sequence': 'The capital of France isamaru.'}, {'score': 0.0010126306442543864, 'token': 46703, 'token_str': 'bians', 'sequence': 'The capital of France isbians.'}, {'score': 0.0008897537481971085, 'token': 43107, 'token_str': 'insured', 'sequence': 'The capital of France isinsured.'}]
@BigBird01
Hello! Thank you for sharing this great piece of work.
I was wondering whether the MLM pre-training code is for training DeBERTa v3 or v2? (or v1)
Regards