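Hello! While training on my own data I ran into the error below. I tried to work around it with the pop function, but I still can't resolve it. Could you please help me find the problem?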

shitft commented 1 year ago

RuntimeError: Error(s) in loading state_dict for NormalNerModel:
    size mismatch for linear.weight: copying a param with shape torch.Size([33, 256]) from checkpoint, the shape in current model is torch.Size([25, 256]).
    size mismatch for linear.bias: copying a param with shape torch.Size([33]) from checkpoint, the shape in current model is torch.Size([25]).
    size mismatch for crf.start_transitions: copying a param with shape torch.Size([33]) from checkpoint, the shape in current model is torch.Size([25]).
    size mismatch for crf.end_transitions: copying a param with shape torch.Size([33]) from checkpoint, the shape in current model is torch.Size([25]).
    size mismatch for crf.transitions: copying a param with shape torch.Size([33, 33]) from checkpoint, the shape in current model is torch.Size([25, 25]).

taishan1994 commented 1 year ago

The number of entity labels does not match.

shitft commented 1 year ago

How do I fix that? I'm a complete beginner and really don't know what to change.

taishan1994 commented 1 year ago

Set num_tags to the number of labels in nor_ent2id.json.
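A minimal sketch of how that count could be checked, assuming nor_ent2id.json is a flat JSON object mapping each label to an integer id. The helper name `count_tags`, the temporary file, and the demo labels are hypothetical and only there so the snippet runs on its own; in practice the path would point at the real nor_ent2id.json for the dataset.

```python
import json
import os
import tempfile


def count_tags(ent2id_path):
    """Return the number of labels in a label-to-id JSON file."""
    with open(ent2id_path, encoding="utf-8") as f:
        ent2id = json.load(f)
    return len(ent2id)


# Stand-in for nor_ent2id.json so the sketch is self-contained;
# these labels are made up for illustration.
demo_labels = {"O": 0, "B-PER": 1, "I-PER": 2, "B-LOC": 3, "I-LOC": 4}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "nor_ent2id.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo_labels, f, ensure_ascii=False)
    num_tags = count_tags(demo_path)

print(f"num_tags should be set to {num_tags}")
```

In the error above the checkpoint was saved with 33 tags while the current model was built with 25, so the checkpoint being loaded and the model being constructed need to agree on this value.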

shitft commented 1 year ago

OK, thank you! I'll try it after I get back from eating.

shitft commented 1 year ago

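Sorry to bother you again! I changed the num_tags parameter in main.sh and in bert_ner_model, but that did not fix the problem. I'm running the project on Windows. After installing Git I can run main.sh normally in the PyCharm terminal, but as soon as I run main.py I still get the size mismatch error. Could you take a look when you have time?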

taishan1994 commented 1 year ago

Don't run it directly inside PyCharm. Copy the command out and run it in a terminal.

shitft commented 1 year ago

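When I set model_name to bilstm, idcnn, or crf and train, model.pt is not generated. Earlier, when I trained on your data, I also saw that some models under checkpoints had a .pt file and some did not. What causes this?

Also, my GPU is an RTX 2060S with 8 GB. When I change model_name to bert, Albert, or similar models, I get an out-of-memory error. I tried several fixes from CSDN and none of them solved it. Apart from buying a new GPU, is there any other way to deal with this?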

taishan1994 commented 1 year ago

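1. Add a print at the point in train where the model is saved and check whether the save is actually executed. train has an eval_steps setting; if the total number of steps is smaller than eval_steps, no model file is generated.
2. Reduce train_batch_size and eval_batch_size until the GPU memory is sufficient.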
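The first point can be sanity-checked with a little arithmetic. This is only a sketch: it assumes one training step per batch, that the last partial batch is kept, and that a model is saved only once the step count reaches eval_steps. The function names and the example numbers are hypothetical and not taken from the repository.

```python
import math


def total_train_steps(num_examples, train_batch_size, epochs):
    """Number of batches seen over the whole run (one step per batch)."""
    steps_per_epoch = math.ceil(num_examples / train_batch_size)
    return steps_per_epoch * epochs


def will_reach_eval(num_examples, train_batch_size, epochs, eval_steps):
    """True if training runs for at least eval_steps steps."""
    return total_train_steps(num_examples, train_batch_size, epochs) >= eval_steps


# Illustrative numbers only.
print(total_train_steps(1000, 32, 3))      # 32 steps per epoch * 3 epochs = 96
print(will_reach_eval(1000, 32, 3, 100))   # 96 < 100, so no save would happen
print(will_reach_eval(1000, 16, 3, 100))   # 189 >= 100, so a save can happen
```

Under these assumptions a small dataset with a large batch size may never reach eval_steps, which would explain a missing model.pt. Lowering eval_steps or train_batch_size raises the chance that the save branch is reached.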

shitft commented 1 year ago

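OK. I tried your second suggestion with batch sizes of 2, 4, 8, 16, and 32, and the final "Tried to allocate" figure was 20 MiB every time, which is strange. Whatever value I use, it still needs 852 MB.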

taishan1994 / pytorch_bert_bilstm_crf_ner

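Hello! While training on my own data I ran into the error below. I tried to work around it with the pop function, but I still can't resolve it. Could you please help me find the problem? #26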