dbmdz / berts

DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models
MIT License

Error(s) in loading state_dict for ElectraForTokenClassification #28

Open lijiayi980130 opened 4 years ago

lijiayi980130 commented 4 years ago

size mismatch for classifier.weight: copying a param with shape torch.Size([8, 1024]) from checkpoint, the shape in current model is torch.Size([9, 1024]).

Hello, I would like to know why your "config.json" file has only 8 labels for the CoNLL-2003 dataset. I think it should have 9 labels.

stefan-it commented 4 years ago

Hi @lijiayi980130,

Good question. I think you're referring to our model:

https://huggingface.co/dbmdz/electra-large-discriminator-finetuned-conll03-english

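The reason for 8 labels is that the original dataset is IOB1-labelled (yes, there are some IOB2-labelled datasets on the internet, but these are not the official ones). In IOB1, a B- tag is only used when an entity directly follows another entity of the same type, and B-PER never occurs in the official files: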

$ cat eng.t* | cut -d " " -f 4 | grep -v "^$" | sort | uniq
B-LOC
B-MISC
B-ORG
I-LOC
I-MISC
I-ORG
I-PER
O
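
To make the 8 versus 9 difference concrete, here is a minimal Python sketch of how IOB1 tags map to IOB2. It is only an illustration and not code from this repository: the function name `iob1_to_iob2` and the sample sentence are made up, and the label list is the one printed above.

```python
# The 8 labels found in the official (IOB1) files, as listed above.
IOB1_LABELS = ["B-LOC", "B-MISC", "B-ORG", "I-LOC", "I-MISC", "I-ORG", "I-PER", "O"]

# Under IOB2 every entity starts with B-, so all four types need a B- and an I- tag.
ENTITY_TYPES = ("LOC", "MISC", "ORG", "PER")
IOB2_LABELS = sorted({"O"} | {f"{p}-{e}" for p in "BI" for e in ENTITY_TYPES})


def iob1_to_iob2(tags):
    """Convert the tags of one sentence from IOB1 to IOB2 (hypothetical helper)."""
    converted = []
    prev = "O"  # previous tag in the original IOB1 sequence
    for tag in tags:
        if tag == "O":
            converted.append(tag)
        else:
            prefix, entity = tag.split("-", 1)
            # An entity starts here if the tag is already B-, or if the
            # previous token was outside any entity or of a different type.
            starts_entity = (
                prefix == "B"
                or prev == "O"
                or prev.split("-", 1)[1] != entity
            )
            converted.append(("B-" if starts_entity else "I-") + entity)
        prev = tag
    return converted


# Made-up sentence: a two-token person, then two adjacent locations.
sentence = ["I-PER", "I-PER", "O", "I-LOC", "B-LOC", "O"]
print(iob1_to_iob2(sentence))
print(len(IOB1_LABELS), len(IOB2_LABELS))
```

With IOB2 the label set grows to 9 because B-PER becomes necessary, which is where the expectation of 9 labels comes from.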

I hope this clarifies the label list entries in the configuration file :hugs:
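
For reference, the size mismatch can be anticipated before loading the weights by comparing label counts. The sketch below is a hedged illustration: the JSON is a stand-in for the relevant part of "config.json" (the id order is an assumption, the labels and the hidden size of 1024 are taken from this thread), and `classifier_shape` is a hypothetical helper.

```python
import json

# Stand-in for config.json, reduced to the two fields used here.
# Assumption: the id order below is illustrative, not copied from the real file.
labels = ["B-LOC", "B-MISC", "B-ORG", "I-LOC", "I-MISC", "I-ORG", "I-PER", "O"]
config_text = json.dumps({
    "hidden_size": 1024,
    "id2label": {str(i): label for i, label in enumerate(labels)},
})


def classifier_shape(config_json):
    """Return (num_labels, hidden_size), the shape of classifier.weight."""
    config = json.loads(config_json)
    return (len(config["id2label"]), config["hidden_size"])


checkpoint_shape = classifier_shape(config_text)
requested_shape = (9, 1024)  # a model built for 9 (IOB2) labels

if checkpoint_shape != requested_shape:
    print(f"size mismatch: checkpoint {checkpoint_shape}, current model {requested_shape}")
```

If the shapes differ, the model was most likely instantiated with a different label list than the one the checkpoint was fine-tuned on.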

lijiayi980130 commented 4 years ago

Thank you! I think I understand now.