microsoft / DeBERTa

The implementation of DeBERTa
MIT License
1.96k stars 220 forks

[bug] incomplete code #20

Open shenfe opened 3 years ago

shenfe commented 3 years ago

In deberta.mlm, MaskedLayerNorm is not imported from deberta.ops, and PreLayerNorm is undefined.

Also, I'm not sure whether deberta.mlm contains the code for pretraining?
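A quick, generic way to confirm which names a module fails to define (a sketch using a stdlib module as a stand-in; here one would point it at deberta.mlm):

```python
import importlib

def missing_names(module_name, names):
    """Return the subset of `names` that the imported module does not define."""
    mod = importlib.import_module(module_name)
    return [n for n in names if not hasattr(mod, n)]

# Stand-in check against the stdlib; for this issue one would run
# missing_names("deberta.mlm", ["MaskedLayerNorm", "PreLayerNorm"])
print(missing_names("math", ["sqrt", "NoSuchName"]))
```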

YeDeming commented 3 years ago

Same question. I modified Hugging Face's code to load all the available weights except deberta.embeddings.position_embeddings.weight and ran:

from transformers import DebertaTokenizer, DebertaForMaskedLM
import torch

tokenizer = DebertaTokenizer.from_pretrained('microsoft/deberta-base')
model = DebertaForMaskedLM.from_pretrained('microsoft/deberta-base')

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]

outputs = model(**inputs, labels=labels)
print(outputs.loss)

The loss is 3.85; did I do something wrong?
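One thing worth checking: in the PyTorch/Hugging Face masked-LM convention, the loss is averaged over every position whose label is not -100, so passing the full target sentence as labels scores the unmasked tokens too, which can inflate the loss. A minimal sketch of restricting the loss to the [MASK] position, using made-up token ids and random logits in place of real tokenizer/model outputs:

```python
import torch
import torch.nn.functional as F

# Hypothetical token ids for "The capital of France is [MASK] ." and its target.
input_ids = torch.tensor([[10, 11, 12, 13, 14, 50264, 15]])
target_ids = torch.tensor([[10, 11, 12, 13, 14, 2000, 15]])  # 2000 standing in for "Paris"
mask_token_id = 50264  # hypothetical id for [MASK]

# Ignore every position that was not masked: -100 is the ignore_index
# used by torch.nn.functional.cross_entropy (and by HF *ForMaskedLM losses).
labels = target_ids.clone()
labels[input_ids != mask_token_id] = -100

# Random logits as a stand-in for the model output (vocab size 3000 here).
logits = torch.randn(1, 7, 3000)

# Cross-entropy only counts positions where labels != -100.
loss = F.cross_entropy(logits.view(-1, 3000), labels.view(-1), ignore_index=-100)
```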

Thanks, Deming