wzzzd / lm_ner

A PyTorch-based named entity recognition framework that supports LSTM+CRF, BERT+CRF, RoBERTa+CRF, and similar architectures

During prediction, every pad position is predicted as the label with index=0. How can this be fixed? #7

Open StrongBirds opened 2 years ago

StrongBirds commented 2 years ago

I have read through the code many times and still cannot solve this. Looking forward to your reply.

Prediction input: [[101, 3851, 3736, 4689, 3343, 2336, 2356, 677, 1814, 3777, 1773, 6125, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102]

Prediction label: [['O', 'B-prov', 'I-prov', 'E-prov', 'B-city', 'I-city', 'E-city', 'B-district', 'E-district', 'B-road', 'I-road', 'E-road', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']

Prediction result: [['O', 'B-prov', 'I-prov', 'E-prov', 'B-city', 'I-city', 'E-city', 'B-district', 'E-district', 'B-road', 'I-road', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov', 'B-prov']
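If the aim is only to keep padded positions out of the reported result, one option is to drop every prediction whose attention-mask entry is 0 before comparing against the expected labels. The sketch below assumes the predictions and the mask for one sentence are plain Python lists of equal length; `strip_padding` is a hypothetical helper, not something from lm_ner, and it does not address why the model emits those labels in the first place.

```python
def strip_padding(pred_labels, attention_mask):
    """Keep only the predictions at positions where the mask is 1."""
    if len(pred_labels) != len(attention_mask):
        raise ValueError("predictions and mask must have the same length")
    return [label for label, m in zip(pred_labels, attention_mask) if m == 1]


# Toy data (made up for illustration): two real tokens followed by two pads.
preds = ["O", "B-prov", "B-prov", "B-prov"]
mask = [1, 1, 0, 0]
kept = strip_padding(preds, mask)
print(kept)
```

The length check is there because a silent `zip` truncation would hide exactly the kind of misalignment this thread is about.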

StrongBirds commented 2 years ago

This is the data fed in during training.

input_ids: [[101, 865, 2001, 2356, 1920, 7391, 7252, 2339, 689, 1736, 1277, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102]

attention_mask: [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],

labels: [[20, 16, 38, 9, 4, 6, 6, 6, 6, 6, 48, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20]
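One detail visible in the data above is that the mask is not a single run of 1s: the final position (token id 102) is unmasked even though it sits after the padding. Some CRF implementations derive each sequence's length from the number of 1s in the mask, which presumes the 1s are contiguous from the start. Whether that is related to the behaviour reported here is not established in this thread; the sketch below is just a quick way to detect such masks. `mask_is_contiguous` is a hypothetical helper.

```python
def mask_is_contiguous(attention_mask):
    """Return True if the mask is a run of 1s followed only by 0s."""
    seen_zero = False
    for m in attention_mask:
        if m == 0:
            seen_zero = True
        elif seen_zero:
            # A non-zero entry after a zero: the mask has a gap.
            return False
    return True


# Shortened stand-in for the posted mask: real tokens, pads, then a trailing 1.
gapped = [1, 1, 1, 0, 0, 0, 1]
prefix_only = [1, 1, 1, 0, 0]
print(mask_is_contiguous(gapped))       # False
print(mask_is_contiguous(prefix_only))  # True
```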

wzzzd commented 2 years ago

Is this the input at line 248 of Trainer.py? (`outputs = self.model(input_ids, labels=labels, attention_mask=attention_mask)`)

StrongBirds commented 2 years ago

    def step(self, bs):
        """
        Training procedure for a single batch.
        """
        # Inputs
        input_ids = bs[0]
        attention_mask = bs[1]
        labels = bs[2]

        # Compute the loss and train
        outputs = self.model(input_ids, labels=labels, attention_mask=attention_mask)
        loss = outputs[0]           # the first element of the model output is used as the loss
        if torch.cuda.device_count() > 1:
            loss = loss.mean()

StrongBirds commented 2 years ago

Yes, that is the input inside `step`.

wzzzd commented 2 years ago

The label mapping step has gone wrong: judging by the prediction labels and the input labels you posted, the label indices do not line up.

Prediction label: [['O', 'B-prov', 'I-prov', 'E-prov', 'B-city', 'I-city', 'E-city', 'B-district', 'E-district', 'B-road', 'I-road', 'E-road', 'O',
Input labels: [[20, 16, 38, 9, 4, 6, 6, 6, 6, 6, 48, 20, 20

StrongBirds commented 2 years ago

20, 16, 38, 9, 4, 6, 6, 6, 6, 6, 48,

The input labels and the prediction labels are not from the same text.

wzzzd commented 2 years ago

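I suggest picking a single text and checking both its raw labels and the data after conversion to indices, to see whether they are correct.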
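The check suggested above can be sketched as a comparison between the label-to-index map used when building training data and the index-to-label map used when decoding predictions. Both dictionaries below are made-up examples, and `mapping_mismatches` is a hypothetical helper; the real maps would have to be pulled from wherever the project builds them.

```python
def mapping_mismatches(label2id, id2label):
    """Return {label: (index, decoded_label)} for every label whose index
    does not decode back to the same label."""
    bad = {}
    for label, idx in label2id.items():
        decoded = id2label.get(idx)
        if decoded != label:
            bad[label] = (idx, decoded)
    return bad


# Made-up maps for illustration: the two sides disagree on indices 0 and 1.
train_label2id = {"O": 0, "B-prov": 1, "I-prov": 2}
predict_id2label = {0: "B-prov", 1: "O", 2: "I-prov"}

mismatches = mapping_mismatches(train_label2id, predict_id2label)
print(mismatches)
```

An empty result means the two maps agree; any entry points at a label whose index means something different at prediction time.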

StrongBirds commented 2 years ago

The key point is that all the earlier characters are recognized correctly; only the last character and the pads after it are recognized incorrectly, which I find hard to understand.

wzzzd commented 2 years ago

1. Which model are you using?
2. How many rows does the training set have, and what proportion of them carry labels?

StrongBirds commented 2 years ago

1. The model is bert+crf.
2. The training set is 20,000 address records, all of them labeled.

StrongBirds commented 2 years ago

The problem is solved. It was a CRF issue; after I switched to torchcrf everything worked normally.

khazic commented 1 year ago

I'm running into the same problem. You said switching to torchcrf fixed it; how exactly should the code be changed? This is urgent, thanks.

khazic commented 1 year ago

I changed the CRF to `from torchcrf import CRF`, and now it raises the error below. How should this part be changed?

`RuntimeError: where expected condition to be a boolean tensor, but got a tensor with dtype Long`
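That message is what `torch.where` raises when its condition argument is an integer tensor rather than a boolean one. A common remedy, assuming the mask being passed along is the integer `attention_mask` from the batch, is to convert it with `.bool()` first. The thread does not say what change was actually made, so treat this as a sketch of the dtype issue only; the tensors below are toy values.

```python
import torch

# Toy mask with the default integer dtype (int64, i.e. Long).
attention_mask = torch.tensor([[1, 1, 1, 0, 0]])
new_score = torch.ones(1, 5)
old_score = torch.zeros(1, 5)

# Convert to a boolean mask before using it as a condition.
bool_mask = attention_mask.bool()
merged = torch.where(bool_mask, new_score, old_score)

print(attention_mask.dtype, bool_mask.dtype)
print(merged)
```

With the boolean mask, positions where the mask is set take values from `new_score` and the rest keep `old_score`.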

khazic commented 1 year ago

I think I've fixed it. I'll run it now; if there are still problems once I have results, I'll ask again.

khazic commented 1 year ago

The validation script fails: `outputs = outputs[1]` raises `IndexError: tuple index out of range`.

What is causing this? Could you help?
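An `IndexError: tuple index out of range` on `outputs[1]` means the model returned a tuple with fewer than two elements. One plausible cause, assuming the swapped-in forward pass returns only the loss while the evaluation code still expects a second element with the predictions, is illustrated below. Every function here is a hypothetical stand-in, not code from lm_ner or torchcrf.

```python
def forward_loss_only(batch):
    """Stand-in for a forward pass that returns just the loss."""
    return (0.5,)


def forward_with_predictions(batch):
    """Stand-in for a forward pass that returns loss and per-token tag ids."""
    loss = 0.5
    predictions = [[0 for _ in seq] for seq in batch]
    return (loss, predictions)


def get_predictions(outputs):
    """Fetch predictions, failing with a clearer message than IndexError."""
    if len(outputs) < 2:
        raise ValueError(
            "model returned %d value(s); evaluation needs (loss, predictions)"
            % len(outputs)
        )
    return outputs[1]


batch = [[101, 865, 102]]
print(get_predictions(forward_with_predictions(batch)))
```

If this is the cause, the fix belongs in the model's forward pass (return the decoded tags alongside the loss) rather than in the evaluation script.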

khazic commented 1 year ago

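I replaced the CRF in bert_crf with the torchcrf version below it and updated the trainer file as well, but during the eval that runs after one training epoch, that file fails, saying output[1] is out of range.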

StrongBirds commented 1 year ago

Did you get it solved?

khazic commented 1 year ago

No! Could you tell me how? Could we connect on WeChat?

khazic commented 1 year ago

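I've now switched to torchcrf. The repo author also changed that code at some point, so I swapped in the commented-out CRF code in BertCrf and updated the import in train.py to match, but eval.py now fails at the `output` line.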

khazic commented 1 year ago

I'm in a hurry. Could you give me some guidance?

khazic commented 1 year ago

If it's convenient, could you explain how to solve this?

khazic commented 1 year ago

Could you tell me how you solved it?