sinovation / ZEN

A BERT-based Chinese Text Encoder Enhanced by N-gram Representations
Apache License 2.0

size mismatch for classifier.bias: copying a param with shape torch.Size #20

Open Jorigorn opened 4 years ago

Jorigorn commented 4 years ago

Can I run the classification task directly, or do I have to fine-tune first? I downloaded all the data and got this error when running the following command:

```shell
python run_sequence_level_classification.py \
    --task_name ChnSentiCorp \
    --do_train \
    --do_eval \
    --do_lower_case \
    --data_dir /path/to/dataset/ChnSentiCorp \
    --bert_model /path/to/zen_model \
    --max_seq_length 512 \
    --train_batch_size 32 \
    --learning_rate 2e-5 \
    --num_train_epochs 30.0
```

```
07/20/2020 22:14:06 - INFO - ZEN.tokenization - loading vocabulary file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/vocab.txt
07/20/2020 22:14:06 - INFO - ZEN.ngram_utils - loading ngram frequency file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/ngram.txt
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading weights file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/pytorch_model.bin
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading configuration file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/config.json
07/20/2020 22:14:08 - INFO - ZEN.modeling - Model config {
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_hidden_word_layers": 6,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128,
  "word_size": 104089
}
```

```
Traceback (most recent call last):
  File "examples/run_sequence_level_classification.py", line 396, in <module>
    main()
  File "examples/run_sequence_level_classification.py", line 361, in main
    if task_name not in processors:
  File "/data/anaconda3/lib/python3.6/site-packages/ZEN-0.1.0-py3.6.egg/ZEN/modeling.py", line 839, in from_pretrained
RuntimeError: Error(s) in loading state_dict for ZenForSequenceClassification:
    size mismatch for classifier.weight: copying a param with shape torch.Size([3, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
    size mismatch for classifier.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
```

thanks a lot.
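For context, the error above is a generic PyTorch shape mismatch: the checkpoint stores a classification head sized for 3 labels, while the ChnSentiCorp task builds a 2-label head. A minimal sketch (not ZEN's actual code; the `768` hidden size is taken from the config above) reproduces the same `RuntimeError`:

```python
import torch.nn as nn

# Hypothetical repro: the checkpoint's head was trained with 3 labels,
# but the current task constructs a 2-label head. Loading one into the
# other fails exactly like the traceback above.
checkpoint_head = nn.Linear(768, 3)  # shape stored in the checkpoint
current_head = nn.Linear(768, 2)     # shape the current model expects

try:
    current_head.load_state_dict(checkpoint_head.state_dict())
except RuntimeError as e:
    # e.g. "size mismatch for weight: copying a param with shape
    # torch.Size([3, 768]) ... current model is torch.Size([2, 768])"
    print(e)
```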

shizhediao commented 2 years ago

It seems you used the wrong checkpoint for your task: your task is ChnSentiCorp (2 labels), but you are loading ZEN_ft_NLI_v0.1.0, whose classification head was fine-tuned for NLI (3 labels). Please try [ZEN_ft_SA](http://zen.chuangxin.com/ZEN/models/ZEN_ft_SA_v0.1.0.zip) instead.
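If someone still wants to reuse a checkpoint whose head does not match, a generic PyTorch workaround (not from this thread, and not ZEN-specific) is to strip the task-specific `classifier.*` keys from the state dict so only the backbone weights are loaded and a fresh head is initialized for the new label count. A minimal sketch, assuming the head keys are prefixed `classifier.` as in the traceback:

```python
def strip_classifier_head(state_dict):
    """Drop task-specific head weights so only the backbone is reused."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith("classifier.")}

# Usage (paths are placeholders):
#   import torch
#   sd = torch.load("pytorch_model.bin", map_location="cpu")
#   torch.save(strip_classifier_head(sd), "pytorch_model_no_head.bin")
```

The stripped checkpoint can then be loaded with `strict=False` (or passed through the model's normal loading path), letting the new classifier layer keep its random initialization.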