duterscmy / ccks2019-ckbqa-4th-codes

Code for Chinese knowledge base question answering: the 4th-place solution in the CCKS2019 CKBQA evaluation
479 stars 91 forks source link

tuple_extractor.py #26

Open Lanme opened 4 years ago

Lanme commented 4 years ago

The candidates_tuple extracted by this function in tuple_extractor.py looks like this: {('<猎国莫妮卡>', '<性别>', '<词性>'): ['<猎国莫妮卡>', '莫妮卡', 0.00027995862]}, which causes an index error at X.append([features[9][0][1]]) in tuple_filter.py. Is score = [entity]+[s for s in candidate_entitys[entity][0:1]] written incorrectly here?

def extract_tuples(self,candidate_entitys,question):
    ''''''
    candidate_tuples = {}

    for entity in candidate_entitys:
        # get all relation paths for this entity
        starttime=time.time()

        relations = GetRelationPaths(entity)

        mention = candidate_entitys[entity][0]
        for r in relations:

            this_tuple = tuple([entity]+r)  # build the candidate tuple
            predicates = [relation[1:-1] for relation in r]  # Python list of relation names

            human_question = '的'.join([mention]+predicates)

            score = [entity]+[s for s in candidate_entitys[entity][0:1]]  # initialize features

            try:
                sim2 = self.sentencepair2sim[question+human_question]
            except:
                sim2 = self.simmer.predict(question,human_question)[0][1]
                self.sentencepair2sim[question+human_question] = sim2
            self.sentencepair2sim[question+human_question] =sim2
            score.append(sim2)

            candidate_tuples[this_tuple] = score
        print ('====查询候选关系并计算特征耗费%.2f秒===='%(time.time()-starttime))

    return candidate_tuples

duterscmy commented 4 years ago

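Sorry, the features in these two files are not quite consistent. tuple_extractor.py is the final version, and it only uses the BERT similarity feature. tuple_filter.py should be changed to X.append([features[2]]); features[2] is the score in ['<猎国莫妮卡>', '莫妮卡', 0.00027995862].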
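The suggested change can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes tuple_filter.py loops over the values of candidate_tuples (that loop is not shown in this thread), and it reuses the single candidate tuple reported above as sample data.

```python
# Sample data copied from the issue: key is the candidate tuple,
# value is the feature list [entity, mention, BERT similarity score].
candidate_tuples = {
    ('<猎国莫妮卡>', '<性别>', '<词性>'): ['<猎国莫妮卡>', '莫妮卡', 0.00027995862],
}

X = []
old_index_error = None
for this_tuple, features in candidate_tuples.items():
    try:
        # Old indexing: the list only has 3 elements, so index 9 is out of range.
        features[9][0][1]
    except IndexError as err:
        old_index_error = err
    # Suggested indexing: take the similarity score at position 2.
    X.append([features[2]])
```

With the final version of tuple_extractor.py, each feature list holds only the entity, the mention, and the similarity score, so position 2 is the only numeric feature available.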

------------------ Original message ------------------
From: "Lanme"<notifications@github.com>
Sent: Wednesday, March 4, 2020, 5:11 PM
To: "duterscmy/ccks2019-ckbqa-4th-codes"<ccks2019-ckbqa-4th-codes@noreply.github.com>
Cc: "Subscribed"<subscribed@noreply.github.com>
Subject: [duterscmy/ccks2019-ckbqa-4th-codes] tuple_extractor.py (#26)


Lanme commented 4 years ago

By the way, in answer_bot.py, the add_props function never adds anything to special_props.

def add_props(self,entity_mention,pred_props):
    '''
    Supplement props using the entity mention
    '''
    # handle property values that contain an enumeration comma (、)

    subject_props = {}
    subject_props.update(pred_props['mark_props'])
    subject_props.update(pred_props['time_props'])
    subject_props.update(pred_props['digit_props'])
    subject_props.update(pred_props['other_props'])
    subject_props.update(pred_props['fuzzy_props'])

    special_props = {}
    subject_props.update(pred_props['mark_props'])
    subject_props.update(pred_props['time_props'])

    return subject_props,special_props
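
The behaviour described above can be reproduced with a standalone sketch. The first function copies the quoted logic (with self and entity_mention dropped). The second is only a guess at the intent: it assumes the last two update calls were meant to target special_props, which the maintainer has not confirmed in this thread. The sample dictionary is made up for illustration.

```python
PROP_KEYS = ('mark_props', 'time_props', 'digit_props', 'other_props', 'fuzzy_props')


def add_props_as_quoted(pred_props):
    """Standalone copy of the quoted logic."""
    subject_props = {}
    for key in PROP_KEYS:
        subject_props.update(pred_props[key])

    special_props = {}
    # The quoted code updates subject_props again here,
    # so special_props is returned empty.
    subject_props.update(pred_props['mark_props'])
    subject_props.update(pred_props['time_props'])
    return subject_props, special_props


def add_props_guess(pred_props):
    """Hypothetical variant: fills special_props from mark_props and time_props."""
    subject_props = {}
    for key in PROP_KEYS:
        subject_props.update(pred_props[key])

    special_props = {}
    special_props.update(pred_props['mark_props'])
    special_props.update(pred_props['time_props'])
    return subject_props, special_props


# Made-up sample input, not taken from the repository.
sample = {
    'mark_props': {'m': 1},
    'time_props': {'t': 2},
    'digit_props': {'d': 3},
    'other_props': {},
    'fuzzy_props': {},
}

quoted_subject, quoted_special = add_props_as_quoted(sample)
guess_subject, guess_special = add_props_guess(sample)
```

Both versions return the same subject_props; they differ only in whether special_props is populated.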