dbiir / UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
https://github.com/dbiir/UER-py/wiki
Apache License 2.0

load albert model error #147

Open phc4valid opened 3 years ago

phc4valid commented 3 years ago

File "run_kbert_cls.py", line 261, in main model.load_state_dict(torch.load(args.pretrained_model_path), strict=False) File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1044, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for Model: size mismatch for embedding.segment_embedding.weight: copying a param with shape torch.Size([1, 128]) from checkpoint, the shape in current model is torch.Size([3, 128]).

Hello, this is a problem I ran into when loading albert-base-chinese. The config.json I used is the one provided by UER: {"emb_size": 128, "feedforward_size": 3072, "hidden_size": 768, "heads_num": 12, "layers_num": 12, "dropout": 0.0}. May I ask where the segment_embedding can be modified? Thank you very much for your work on this project. Stay safe.
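
For anyone who hits the same mismatch: the checkpoint stores a 1-row segment (token-type) embedding while the model expects 3 rows. Below is a minimal workaround sketch that pads the converted checkpoint before loading; the key name comes from the error message, but the file names are placeholders, and zero-padding the extra rows is an assumption, not the project's official fix (the official fix is the updated conversion script mentioned in the next comment).

import torch

# Placeholder paths; the key name is taken from the error message above.
state_dict = torch.load("albert_base_chinese_uer.bin", map_location="cpu")

seg = state_dict["embedding.segment_embedding.weight"]   # torch.Size([1, 128])
if seg.size(0) == 1:
    padded = torch.zeros(3, seg.size(1))                 # target: torch.Size([3, 128])
    padded[:1] = seg                                     # keep the converted row, zero-init the rest
    state_dict["embedding.segment_embedding.weight"] = padded

torch.save(state_dict, "albert_base_chinese_uer_fixed.bin")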

hhou435 commented 3 years ago

Hello, this was a bug in the conversion script. The conversion script has been updated, so please test again. Thank you very much for your interest in the project.
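
Re-converting means regenerating the UER-format weights from the original checkpoint. A sketch of what that invocation might look like; the script name follows UER-py's scripts/ directory, but the exact flag names and paths here are assumptions, so check them against your checkout:

python3 scripts/convert_albert_from_huggingface_to_uer.py \
    --input_model_path albert_base_chinese_huggingface.bin \
    --output_model_path albert_base_chinese_uer.bin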

phc4valid commented 3 years ago

Hello, I used the new conversion script and a new error followed:

Traceback (most recent call last):
  File "run_kbert_cls.py", line 625, in <module>
    main()
  File "run_kbert_cls.py", line 588, in main
    loss, _ = model(input_ids_batch, label_ids_batch, mask_ids_batch, pos=pos_ids_batch, vm=vms_batch)
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "run_kbert_cls.py", line 53, in forward
    output = self.encoder(emb, mask, vm)
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workplace/phchen/K-BERT-master/uer/encoders/bert_encoder.py", line 48, in forward
    hidden = self.transformer[i](hidden, mask)
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workplace/phchen/K-BERT-master/uer/layers/transformer.py", line 38, in forward
    inter = self.dropout_1(self.self_attn(hidden, hidden, hidden, mask))
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workplace/phchen/K-BERT-master/uer/layers/multi_headed_attn.py", line 51, in forward
    query, key, value = [l(x). \
  File "/workplace/phchen/K-BERT-master/uer/layers/multi_headed_attn.py", line 51, in <listcomp>
    query, key, value = [l(x). \
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 91, in forward
    return F.linear(input, self.weight, self.bias)
  File "/opt/conda/envs/phchen-k/lib/python3.8/site-packages/torch/nn/functional.py", line 1676, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [4096 x 128], m2: [768 x 768] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

The config.json used is {"emb_size": 128, "feedforward_size": 3072, "hidden_size": 768, "heads_num": 12, "layers_num": 12, "dropout": 0.0}. May I ask what m1 and m2 refer to, and how can this be resolved? Many thanks for your time. @hhou435
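
For reference on what m1 and m2 mean: F.linear computes input.matmul(weight.t()), so m1 is the flattened encoder input of shape [batch_size * seq_length, emb_size] (here 4096 x 128) and m2 is the attention layer's [hidden_size, hidden_size] weight (768 x 768). ALBERT's 128-dimensional embeddings are being fed into an encoder that expects 768-dimensional inputs; ALBERT normally bridges the two with a linear projection (factorized embedding parameterization), which K-BERT's plain BERT encoder lacks. A minimal sketch with the shapes from the traceback (the layer names are illustrative, not K-BERT's):

import torch
import torch.nn as nn

emb_size, hidden_size = 128, 768
x = torch.randn(8, 512, emb_size)                # flattens to m1: [4096 x 128]

attn_proj = nn.Linear(hidden_size, hidden_size)  # its weight is m2: [768 x 768]

try:
    attn_proj(x)                                 # fails: last dim of x (128) != in_features (768)
except RuntimeError as err:
    print(err)                                   # shape-mismatch error; wording varies by PyTorch version

# ALBERT's factorized embedding parameterization inserts this projection first:
emb_to_hidden = nn.Linear(emb_size, hidden_size)
out = attn_proj(emb_to_hidden(x))
print(out.shape)                                 # torch.Size([8, 512, 768])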

hhou435 commented 3 years ago

Hello, could you share the command you are running?

phc4valid commented 3 years ago

Hello, I am running run_kbert_cls.py from https://github.com/autoliuweijie/K-BERT?utm_source=catalyzex.com. What additional information should I provide to help you troubleshoot the problem? Thank you.

hhou435 commented 3 years ago

For fine-tuning ALBERT, you can refer to https://github.com/dbiir/UER-py/wiki/下游任务微调 (the downstream task fine-tuning page).
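
The relevant point on that page: when fine-tuning converted ALBERT weights, UER-py's classifier enables the embedding-to-hidden projection through command-line options rather than through config.json alone. A sketch of the kind of invocation documented there; the paths and dataset are placeholders, and the flags (--factorized_embedding_parameterization, --parameter_sharing) should be checked against your UER-py version:

python3 run_classifier.py --pretrained_model_path models/albert_base_zh_model.bin \
                          --vocab_path models/google_zh_vocab.txt \
                          --config_path models/albert_base_config.json \
                          --train_path datasets/douban_book_review/train.tsv \
                          --dev_path datasets/douban_book_review/dev.tsv \
                          --test_path datasets/douban_book_review/test.tsv \
                          --epochs_num 5 --batch_size 32 \
                          --factorized_embedding_parameterization --parameter_sharing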

LSQii commented 3 years ago

Hello, sorry to bother you. Did you ever solve this problem? I ran into the same issue when using ALBERT as the pretrained model.