songyingxin / Bert-TextClassification

Implementation of some baseline models on top of BERT for text classification

load_state_dict error - some problems on parameters #21


samaritanhu commented 4 years ago

Thanks for sharing, but I ran into some problems with the BertOrigin model (and the other models as well).

First, I copied config.json into .sst_output, and then the error below appeared. It seems to be a very common problem; I saw the same error in this repo too: https://github.com/naver/sqlova/issues/1 . Even after I renamed gamma to weight and beta to bias, the model still would not work; the error then became NotImplementedError.

RuntimeError: Error(s) in loading state_dict for BertOrigin: Missing key(s) in state_dict: "bert.embeddings.LayerNorm.weight", "bert.embeddings.LayerNorm.bias", "bert.encoder.layer.0.attention.output.LayerNorm.weight", "bert.encoder.layer.0.attention.output.LayerNorm.bias", "bert.encoder.layer.0.output.LayerNorm.weight", "bert.encoder.layer.0.output.LayerNorm.bias", "bert.encoder.layer.1.attention.output.LayerNorm.weight", "bert.encoder.layer.1.attention.output.LayerNorm.bias", "bert.encoder.layer.1.output.LayerNorm.weight", "bert.encoder.layer.1.output.LayerNorm.bias", "bert.encoder.layer.2.attention.output.LayerNorm.weight", "bert.encoder.layer.2.attention.output.LayerNorm.bias", "bert.encoder.layer.2.output.LayerNorm.weight", "bert.encoder.layer.2.output.LayerNorm.bias", "bert.encoder.layer.3.attention.output.LayerNorm.weight", "bert.encoder.layer.3.attention.output.LayerNorm.bias", "bert.encoder.layer.3.output.LayerNorm.weight", "bert.encoder.layer.3.output.LayerNorm.bias", "bert.encoder.layer.4.attention.output.LayerNorm.weight", "bert.encoder.layer.4.attention.output.LayerNorm.bias", "bert.encoder.layer.4.output.LayerNorm.weight", "bert.encoder.layer.4.output.LayerNorm.bias", "bert.encoder.layer.5.attention.output.LayerNorm.weight", "bert.encoder.layer.5.attention.output.LayerNorm.bias", "bert.encoder.layer.5.output.LayerNorm.weight", "bert.encoder.layer.5.output.LayerNorm.bias", "bert.encoder.layer.6.attention.output.LayerNorm.weight", "bert.encoder.layer.6.attention.output.LayerNorm.bias", "bert.encoder.layer.6.output.LayerNorm.weight", "bert.encoder.layer.6.output.LayerNorm.bias", "bert.encoder.layer.7.attention.output.LayerNorm.weight", "bert.encoder.layer.7.attention.output.LayerNorm.bias", "bert.encoder.layer.7.output.LayerNorm.weight", "bert.encoder.layer.7.output.LayerNorm.bias", "bert.encoder.layer.8.attention.output.LayerNorm.weight", "bert.encoder.layer.8.attention.output.LayerNorm.bias", "bert.encoder.layer.8.output.LayerNorm.weight", "bert.encoder.layer.8.output.LayerNorm.bias", "bert.encoder.layer.9.attention.output.LayerNorm.weight", "bert.encoder.layer.9.attention.output.LayerNorm.bias", "bert.encoder.layer.9.output.LayerNorm.weight", "bert.encoder.layer.9.output.LayerNorm.bias", "bert.encoder.layer.10.attention.output.LayerNorm.weight", "bert.encoder.layer.10.attention.output.LayerNorm.bias", "bert.encoder.layer.10.output.LayerNorm.weight", "bert.encoder.layer.10.output.LayerNorm.bias", "bert.encoder.layer.11.attention.output.LayerNorm.weight", "bert.encoder.layer.11.attention.output.LayerNorm.bias", "bert.encoder.layer.11.output.LayerNorm.weight", "bert.encoder.layer.11.output.LayerNorm.bias", "classifier.weight", "classifier.bias". 
Unexpected key(s) in state_dict: "cls.predictions.bias", "cls.predictions.transform.dense.weight", "cls.predictions.transform.dense.bias", "cls.predictions.transform.LayerNorm.gamma", "cls.predictions.transform.LayerNorm.beta", "cls.predictions.decoder.weight", "cls.seq_relationship.weight", "cls.seq_relationship.bias", "bert.embeddings.LayerNorm.gamma", "bert.embeddings.LayerNorm.beta", "bert.encoder.layer.0.attention.output.LayerNorm.gamma", "bert.encoder.layer.0.attention.output.LayerNorm.beta", "bert.encoder.layer.0.output.LayerNorm.gamma", "bert.encoder.layer.0.output.LayerNorm.beta", "bert.encoder.layer.1.attention.output.LayerNorm.gamma", "bert.encoder.layer.1.attention.output.LayerNorm.beta", "bert.encoder.layer.1.output.LayerNorm.gamma", "bert.encoder.layer.1.output.LayerNorm.beta", "bert.encoder.layer.2.attention.output.LayerNorm.gamma", "bert.encoder.layer.2.attention.output.LayerNorm.beta", "bert.encoder.layer.2.output.LayerNorm.gamma", "bert.encoder.layer.2.output.LayerNorm.beta", "bert.encoder.layer.3.attention.output.LayerNorm.gamma", "bert.encoder.layer.3.attention.output.LayerNorm.beta", "bert.encoder.layer.3.output.LayerNorm.gamma", "bert.encoder.layer.3.output.LayerNorm.beta", "bert.encoder.layer.4.attention.output.LayerNorm.gamma", "bert.encoder.layer.4.attention.output.LayerNorm.beta", "bert.encoder.layer.4.output.LayerNorm.gamma", "bert.encoder.layer.4.output.LayerNorm.beta", "bert.encoder.layer.5.attention.output.LayerNorm.gamma", "bert.encoder.layer.5.attention.output.LayerNorm.beta", "bert.encoder.layer.5.output.LayerNorm.gamma", "bert.encoder.layer.5.output.LayerNorm.beta", "bert.encoder.layer.6.attention.output.LayerNorm.gamma", "bert.encoder.layer.6.attention.output.LayerNorm.beta", "bert.encoder.layer.6.output.LayerNorm.gamma", "bert.encoder.layer.6.output.LayerNorm.beta", "bert.encoder.layer.7.attention.output.LayerNorm.gamma", "bert.encoder.layer.7.attention.output.LayerNorm.beta", "bert.encoder.layer.7.output.LayerNorm.gamma", "bert.encoder.layer.7.output.LayerNorm.beta", "bert.encoder.layer.8.attention.output.LayerNorm.gamma", "bert.encoder.layer.8.attention.output.LayerNorm.beta", "bert.encoder.layer.8.output.LayerNorm.gamma", "bert.encoder.layer.8.output.LayerNorm.beta", "bert.encoder.layer.9.attention.output.LayerNorm.gamma", "bert.encoder.layer.9.attention.output.LayerNorm.beta", "bert.encoder.layer.9.output.LayerNorm.gamma", "bert.encoder.layer.9.output.LayerNorm.beta", "bert.encoder.layer.10.attention.output.LayerNorm.gamma", "bert.encoder.layer.10.attention.output.LayerNorm.beta", "bert.encoder.layer.10.output.LayerNorm.gamma", "bert.encoder.layer.10.output.LayerNorm.beta", "bert.encoder.layer.11.attention.output.LayerNorm.gamma", "bert.encoder.layer.11.attention.output.LayerNorm.beta", "bert.encoder.layer.11.output.LayerNorm.gamma", "bert.encoder.layer.11.output.LayerNorm.beta".
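
For anyone else hitting this: the mismatch is a naming-convention change. The checkpoint was saved by older BERT code that names the LayerNorm parameters gamma/beta and still contains the pre-training heads (cls.predictions.*, cls.seq_relationship.*), while BertOrigin expects weight/bias plus its own classifier layer. Below is a minimal sketch of remapping the keys before loading; load_bert_checkpoint is a hypothetical helper, not part of this repo, and it assumes model is an already-constructed BertOrigin instance and that the checkpoint file is a plain PyTorch state_dict.

```python
import torch

def load_bert_checkpoint(model, checkpoint_path):
    """Load an old-style BERT checkpoint into `model`.

    Renames LayerNorm gamma/beta to weight/bias and drops the
    pre-training heads (cls.*), which BertOrigin does not have.
    """
    state_dict = torch.load(checkpoint_path, map_location="cpu")

    new_state_dict = {}
    for key, value in state_dict.items():
        if key.startswith("cls."):
            # Masked-LM / next-sentence-prediction heads; unused
            # by the classification model, so skip them.
            continue
        key = key.replace("LayerNorm.gamma", "LayerNorm.weight")
        key = key.replace("LayerNorm.beta", "LayerNorm.bias")
        new_state_dict[key] = value

    # strict=False leaves classifier.weight / classifier.bias at
    # their random initialization, which is expected before
    # fine-tuning.
    return model.load_state_dict(new_state_dict, strict=False)
```

With strict=False, the reported missing keys should shrink to just classifier.weight and classifier.bias, which are meant to be trained during fine-tuning rather than loaded from the checkpoint.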

van-der-Poel commented 1 year ago

I encountered the same problems.