Open ChengsongLu opened 1 year ago
What you are doing is correct: you are actually getting the v3 weights; there is no DebertaV3Model on Hugging Face yet.
It seems V3 uses the same architecture as V2?
DebertaV2Config {
"_name_or_path": "microsoft/deberta-v3-base",
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
...
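As a quick way to see why the Auto classes resolve to the V2 implementations, you can check transformers' model-type registry (a minimal sketch; it assumes only that CONFIG_MAPPING, the public registry of config classes, is importable — no model download needed):

```python
from transformers import CONFIG_MAPPING

# DeBERTa-v3 checkpoints declare model_type "deberta-v2" in their config,
# so the Auto classes dispatch to the DebertaV2* implementations.
print("deberta-v2" in CONFIG_MAPPING)  # True
print("deberta-v3" in CONFIG_MAPPING)  # False — no separate v3 model type is registered
```

Since AutoModel and AutoTokenizer look up the checkpoint's model_type in this registry, loading a v3 checkpoint is expected to return DebertaV2* objects.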
I successfully ran the following code.
import sentencepiece
from transformers import DebertaV2Model, DebertaV2Config, DebertaV2Tokenizer
MODEL_NAME = 'microsoft/deberta-v3-base'
model = DebertaV2Model.from_pretrained(MODEL_NAME)
config = DebertaV2Config.from_pretrained(MODEL_NAME)
tokenizer = DebertaV2Tokenizer.from_pretrained(MODEL_NAME)
Output:
Downloading spm.model: 100%
2.46M/2.46M [00:00<00:00, 22.7MB/s]
Downloading (…)okenizer_config.json: 100%
52.0/52.0 [00:00<00:00, 2.33kB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Hi,
from transformers import AutoTokenizer, AutoModel

model_name = 'microsoft/deberta-v3-large'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
When I load the v3 model, it returns a V2 model instead. How can I use the v3 model and tokenizer correctly?