google-research / albert

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Apache License 2.0

ValueError: Shape of variable bert/pooler/dense/bias:0 ((128,)) doesn't match with shape of tensor bert/pooler/dense/bias ([768]) from checkpoint reader. #142

Open parkourcx opened 4 years ago

parkourcx commented 4 years ago

I have pretrained an ALBERT model on my own data (Chinese), and I wanted to use this pretrained model for downstream tasks, but I got this error when loading the pretrained ALBERT checkpoint. I guess it's caused by some config parameter issue, but I don't know how to fix it. My pretrained ALBERT config is:

```json
{
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 20974
}
```

I have tried changing "hidden_size", "pooler_fc_size", etc., but it didn't work. How can I fix this issue?
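When a downstream repo refuses a checkpoint like this, it helps to compare the variable shapes the model code will build against the shapes actually stored in the checkpoint. A minimal, library-free sketch of that comparison (the shape maps below are illustrative and reproduce the pooler mismatch from the ValueError above; with TensorFlow you would obtain the checkpoint side from `tf.train.load_checkpoint(path).get_variable_to_shape_map()`):

```python
# Hypothetical shape maps: "expected" is what the downstream model code
# builds, "checkpoint" is what the pretrained file contains. The pooler
# bias entry mirrors the ValueError reported in this issue; the word
# embedding entry matches and is filtered out.
expected = {
    "bert/pooler/dense/bias": (128,),
    "bert/embeddings/word_embeddings": (20974, 128),
}
checkpoint = {
    "bert/pooler/dense/bias": (768,),
    "bert/embeddings/word_embeddings": (20974, 128),
}

def shape_mismatches(expected, checkpoint):
    """Return {name: (built_shape, checkpoint_shape)} for every variable
    present on both sides whose shapes disagree."""
    return {
        name: (expected[name], checkpoint[name])
        for name in expected
        if name in checkpoint and expected[name] != checkpoint[name]
    }

print(shape_mismatches(expected, checkpoint))
```

Any non-empty result means the config the downstream code uses does not describe the model that produced the checkpoint.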

0x0539 commented 4 years ago

What was this error? Can you post a stacktrace?

parkourcx commented 4 years ago

Thanks for replying. I recently tried to train an ALBERT for ancient Chinese on my own corpus using google-research ALBERT; the error occurred when I used it for downstream tasks with another repo (https://github.com/brightmart/albert_zh). I emailed the owner of that repo, and the reason seems to be that there are some code changes in albert_zh, so Google's pretrained ALBERT cannot be used with albert_zh. The albert_zh training is now in progress, and I will try again. I will keep you updated, good day sir.


ahzz1207 commented 4 years ago

I can help you. If you compare the code of these two ALBERT versions, you'll find that their embedding steps are different: brightmart's albert_zh already projects from 128 to 768 during the embedding step, while Google's version does that projection inside the transformer.