Closed zhengxiaoxuer closed 4 years ago
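That's right. Both entity recognition and property-value extraction exist to keep the recall of candidate entities high: if the candidate entity is never found, the answer will be wrong no matter how well the entity-linking stage performs. So we each wrote our own mention-extraction method and then merged the results. question2mention is simply the entities another teammate extracted, saved to a file. At the time I felt that adding that program made the structure a bit messy, so I didn't upload it; it is in data/ now. Also, once property-value extraction has run, the next step is entity linking. Our entity linking is only average, so you could refer to the first-place team's paper to improve the features.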
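To make the recall-first idea above concrete, here is a minimal sketch, not the repository's actual code. `merge_mentions` and `lookup_long_props` are hypothetical helper names, and the layout of each question2mention entry is an assumption: only index 1 (the list of long property values) is confirmed by the quoted snippet, while index 0 is guessed to hold entity mentions.

```python
from typing import Dict, Iterable, List, Sequence


def merge_mentions(*mention_lists: Iterable[str]) -> List[str]:
    """Union the mentions found by several extractors, keeping first-seen order.

    Merging trades precision for recall: a mention found by any one
    extractor survives into the candidate set.
    """
    seen: Dict[str, None] = {}
    for mentions in mention_lists:
        for mention in mentions:
            seen.setdefault(mention, None)
    return list(seen)


def lookup_long_props(
    question2mention: Dict[str, Sequence[Sequence[str]]], question: str
) -> Dict[str, str]:
    """Return {prop: prop} for the pre-extracted long property values.

    ASSUMPTION: entry[1] holds the property values, as in the quoted
    snippet; a question missing from the file simply yields no props.
    """
    entry = question2mention.get(question)
    if entry is None:
        return {}
    return {prop: prop for prop in entry[1]}
```

Using `dict.get` instead of a bare `except` keeps the "question not in the file" case explicit without hiding unrelated errors.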
On 11/5/2019 16:37,zhengxiaoxuer<notifications@github.com> wrote:
@duterscmy In prop_extractor.py, around line 48, this part extracts property values. Why is question2mention used to extract entities here?
    try:
        max_props = self.question2mention[QUES][1]
        for p in max_props:
            mark_props[p] = p
    except:
        print('this question dont have long props')
        pass
Add me, bro. I'm also reproducing the OP's code. 15821444815
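There is one spot in this code file:
    for x in gold_entitys:
        if x[0] == '\"':
            gold_props.append(x)
I can't follow this logic. Why does something that starts with a quotation mark count as gold_props, while something that doesn't start with one does not?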
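Because in the pkubase knowledge base, entities are written inside <> and text property values are written inside double quotes, so the leading double quote is used as the test. I didn't write the pkubase preprocessing part and that code is a bit messy; I'll tidy it up over the weekend and upload it.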
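The delimiter convention described above can be sketched as follows. This is an illustration under the stated convention only (entities in `<>`, text property values in double quotes); `kb_term_kind` and `split_gold` are hypothetical names that do not come from the repository, and the sample terms in the usage note are made up.

```python
from typing import List, Tuple


def kb_term_kind(term: str) -> str:
    """Classify a knowledge-base term by its delimiters.

    ASSUMPTION: entities look like <...> and text property values start
    with a double quote, as described for pkubase in this thread.
    """
    if term.startswith("<") and term.endswith(">"):
        return "entity"
    if term.startswith('"'):
        return "literal"
    return "unknown"


def split_gold(gold_terms: List[str]) -> Tuple[List[str], List[str]]:
    """Split gold terms into (entities, property values)."""
    entities = [t for t in gold_terms if kb_term_kind(t) == "entity"]
    props = [t for t in gold_terms if kb_term_kind(t) == "literal"]
    return entities, props
```

For example, `split_gold(['<北京大学>', '"1898年"'])` separates the entity from the quoted property value. Unlike indexing with `x[0]`, `str.startswith` does not raise on an empty string.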
---Original--- From: "JaonLiu"<notifications@github.com> Date: Tue, Nov 19, 2019 19:19 PM To: "duterscmy/ccks2019-ckbqa-4th-codes"<ccks2019-ckbqa-4th-codes@noreply.github.com>; Cc: "Mention"<mention@noreply.github.com>;"Caomingyu"<1054527636@qq.com>; Subject: Re: [duterscmy/ccks2019-ckbqa-4th-codes] prop_extractor.py (#7)