microsoft / DeBERTa

The implementation of DeBERTa
MIT License

Why are vocab.txt and tokenizer.json not included in the pretrained model on Hugging Face? #117

Open XuJianzhi opened 1 year ago

XuJianzhi commented 1 year ago

https://huggingface.co/microsoft/deberta-v2-xlarge/tree/main

If I run: tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v2-xlarge')

I get this error: ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a tokenizers library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

XuJianzhi commented 1 year ago

How can I convert spm.model to tokenizer.json?