Questions about some details in lcf_atepc.py

zhhhzhang commented 4 years ago

def forward(self, input_ids_spc, token_type_ids=None, attention_mask=None, labels=None, polarities=None, valid_ids=None, attention_mask_label=None):

    if not self.args.use_bert_spc:  # args.use_bert_spc=false
        input_ids_spc = self.get_ids_for_local_context_extractor(input_ids_spc)
        labels = self.get_batch_token_labels_bert_base_indices(labels)

    global_context_out, _ = self.bert(input_ids_spc, token_type_ids, attention_mask)  
    polarity_labels = self.get_batch_polarities(polarities)

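Question 1: With use_bert_spc = False, the BERT-SPC input has already been converted to BERT-BASE format. If a sentence contains several aspects, the BERT-BASE format produces the same sentence representation for all of them, so how does the ATE task tell the aspects apart? In self.bert(input_ids_spc, token_type_ids, attention_mask), token_type_ids and attention_mask are not modified at all. Does that mean the attention computation can still include information about the aspect after the first [SEP]? Is that the right way to read it?

Question 2: The readme.md says BERT-SPC cannot be used to train and test the ATE task. How should I understand this?

Question 3: For the ATE task, why is labels = ["O", "B-ASP", "I-ASP", "[CLS]", "[SEP]"]? Why add [CLS] and [SEP]? And why is num_labels = len(label_list) + 1? I don't quite understand.

Question 4:

    report = classification_report(y_true, y_pred, digits=4)
    tmps = report.split()
    ate_result = round(float(tmps[7]) * 100, 2)

report holds the evaluation metrics for each class, so why take only tmps[7]? Only the first class? How should I read this code? Sorry, I misread it. Does this function from seqeval.metrics compute F1 directly?

That is a lot of questions. I would appreciate any guidance. Thanks!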

yangheng95 commented 4 years ago

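Hello,

Questions 1 and 2: First, as the README says, the BERT-SPC input format is not used, for the sake of the reliability of the ATE subtask. The specific reason is that BERT-SPC requires appending the aspect to the end of the sentence, but the aspect cannot be provided when ATE is trained jointly, because the purpose of ATE is to label the aspects. So there is no extra aspect after [SEP]; that is, when ATE is considered, only the BERT-BASE input format is used. ATE treats all aspects alike and does not distinguish between them, whereas the APC task does consider the specific aspect.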
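To make the difference between the two input formats concrete, here is a minimal sketch in plain Python. The helper names `build_bert_base` and `build_bert_spc` and the example sentence are illustrative and not taken from the repository, and whitespace splitting stands in for the real WordPiece tokenizer.

```python
def build_bert_base(tokens):
    """BERT-BASE style input: [CLS] sentence [SEP]."""
    return ["[CLS]"] + tokens + ["[SEP]"]


def build_bert_spc(tokens, aspect_tokens):
    """BERT-SPC style input: [CLS] sentence [SEP] aspect [SEP]."""
    return ["[CLS]"] + tokens + ["[SEP]"] + aspect_tokens + ["[SEP]"]


# Hypothetical example sentence with two aspects.
tokens = "the food was great but the service was slow".split()
aspects = [["food"], ["service"]]

# BERT-SPC needs one input per aspect, so the aspect must already be known.
spc_inputs = [build_bert_spc(tokens, a) for a in aspects]

# The BERT-BASE input is identical for every aspect of the sentence.
base_inputs = [build_bert_base(tokens) for _ in aspects]

# ATE marks all aspects in a single tag sequence over that one input.
aspect_words = {"food", "service"}
ate_labels = (
    ["[CLS]"]
    + ["B-ASP" if t in aspect_words else "O" for t in tokens]
    + ["[SEP]"]
)

for tok, tag in zip(base_inputs[0], ate_labels):
    print(f"{tok:10s} {tag}")
```

Under these assumptions the aspects are told apart by the per-token tags, not by the sentence representation, which is why ATE can work from the aspect-free BERT-BASE input.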

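Question 3: Besides the three tags "O", "B-ASP" and "I-ASP", aspect labeling also takes the two tags [CLS] and [SEP] into account. This is a design choice; [CLS] and [SEP] could also be replaced with "O".

Question 4: The evaluation code outputs a report of the metrics, including F1. ate_result = round(float(tmps[7]) * 100, 2) extracts the F1 value (tmps[7]) from that report. You can print report directly to check.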
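The indexing can be checked without running the model. The sketch below uses a hand-written string laid out like a report with a header row and a single class row; the class name and the numbers are invented for illustration and are not results from the repository.

```python
# Hand-written text shaped like a one-class report (values are made up).
report = """
              precision    recall  f1-score   support

         ASP     0.8000    0.9000    0.8471       100
"""

tmps = report.split()
# tmps[0:4] -> the four header names
# tmps[4]   -> class name
# tmps[5]   -> precision, tmps[6] -> recall, tmps[7] -> f1-score
ate_result = round(float(tmps[7]) * 100, 2)

print(tmps[:9])
print(ate_result)
```

Indexing by position depends on the report layout, so it would pick a different value if the first row changed. If seqeval is installed, `seqeval.metrics.f1_score(y_true, y_pred)` returns the score directly without parsing text.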

zhhhzhang commented 4 years ago

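Thank you for the prompt and patient reply! A few details are still unclear to me, so I would like to follow up.

As you said, ATE treats all aspects alike and does not distinguish between them. But in global_context_out, _ = self.bert(input_ids_spc, token_type_ids, attention_mask), the aspect part of attention_mask is not set to 0. Could that cause a problem when attention is computed? Or have I misunderstood?
I would also like to ask about the label design: why is num_labels = len(label_list) + 1? And does adding more label classes affect the results? I have little experience with tagging schemes.

Many thanks!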

yangheng95 commented 4 years ago

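That is not a problem; there is no need to deliberately keep the aspect out of attention during training. The +1 accounts for possible padding, because the related code assigns tag ids starting from 1 (https://github.com/yangheng95/LCF-ATEPC/blob/1aecbc541eeb7b44d887b151a149bea8f870382c/utils/data_utils.py#L192). This follows common practice in BERT models for sequence labeling.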
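A minimal sketch of that convention, assuming tag ids are assigned with `enumerate(label_list, 1)` and padding positions get id 0; `max_seq_length` and the example tag sequence are hypothetical values chosen for the illustration.

```python
label_list = ["O", "B-ASP", "I-ASP", "[CLS]", "[SEP]"]

# Tag ids start from 1, so id 0 stays free for padding positions.
label_map = {label: i for i, label in enumerate(label_list, 1)}
num_labels = len(label_list) + 1  # 5 real tags + 1 padding id

tags = ["[CLS]", "O", "B-ASP", "I-ASP", "O", "[SEP]"]
max_seq_length = 8  # hypothetical value for the example

label_ids = [label_map[t] for t in tags]
label_ids += [0] * (max_seq_length - len(label_ids))  # pad with id 0

print(label_map)
print(label_ids)
```

With ids running from 0 to 5, a classifier needs six output classes so that every id, including the padding id, is a valid class index.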

zhhhzhang commented 4 years ago

OK! Thanks a lot!

zhhhzhang commented 4 years ago

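Sorry to bother you again, I have one more detail question:

For y_true: temp_1.append(label_map.get(label_ids[i][j], 'O')). Why is 'O' used as the replacement here? For y_pred: temp_2.append(label_map.get(ate_logits[i][j], 1)). Why is 1 used as the replacement here?

Why would a prediction come out as None? When that happens, is it reasonable to replace it with 'O' in one case and 1 in the other? Wouldn't it be better to use 'O' for both, or 1 for both?

Thanks!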

yangheng95 commented 4 years ago

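These two lines were already changed in an earlier commit, and both now default to 'O'; please merge the latest code. In experiments, replacing with 1 or 'O' does not directly affect model performance. The fallback exists because, early on, the predicted label output by the model may not be in the label map: for example, the model may output 0, while the label map starts from 1.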
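The fallback behaviour can be reproduced in isolation. This is a sketch assuming an id-to-tag map whose keys start from 1; `predicted_ids` is a hypothetical model output that contains the unmapped id 0.

```python
label_list = ["O", "B-ASP", "I-ASP", "[CLS]", "[SEP]"]

# Inverse map from id to tag; ids start from 1, so 0 is not a key.
label_map = {i: label for i, label in enumerate(label_list, 1)}

predicted_ids = [4, 1, 2, 0, 1, 5]  # hypothetical output with an unmapped 0

# dict.get returns the default instead of None when the id is missing.
y_pred = [label_map.get(i, "O") for i in predicted_ids]

print(y_pred)
```

Using the string 'O' as the default keeps every element of `y_pred` a tag string, whereas a default of 1 would place an integer in a list of string tags.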

zhhhzhang commented 4 years ago

OK! Thanks a lot.

yangheng95 / LCF-ATEPC

Questions about some details in lcf_atepc.py #9