GX77 closed this issue 3 years ago
Unless you need a fair comparison with previous methods for a paper, you can comment out these lines.
with open("../wikidata5m_triplet.txt", 'r', encoding='utf-8') as fin:
    lines = fin.readlines()
    for i in tqdm(range(len(lines))):
        line = lines[i]
        v = line.strip().split("\t")
        if len(v) != 3:
            continue
        h, r, t = v
        if (h, r, t) not in fewrel_triples:
            if h in head_cluster:
                head_cluster[h].append((r, t))
            else:
                head_cluster[h] = [(r, t)]
            if t in tail_cluster:
                tail_cluster[t].append((r, h))
            else:
                tail_cluster[t] = [(r, h)]
        else:
            num_del += 1
        total += 1
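For readers following along, here is a minimal, self-contained sketch of what the loop above does: it groups each triple under its head entity and under its tail entity, skipping any triple found in the exclusion set. The helper name `build_clusters`, the use of `collections.defaultdict`, and the toy entity and relation IDs are assumptions for illustration only; they are not part of the CoLAKE repository.

```python
from collections import defaultdict


def build_clusters(lines, excluded=frozenset()):
    """Group tab-separated (head, relation, tail) lines by head and by tail.

    `excluded` plays the role of fewrel_triples: triples in it are skipped.
    This is a hypothetical helper, not the repository's own function.
    """
    head_cluster = defaultdict(list)
    tail_cluster = defaultdict(list)
    num_del = 0
    total = 0
    for line in lines:
        v = line.strip().split("\t")
        if len(v) != 3:
            continue  # malformed line: skipped and not counted
        h, r, t = v
        if (h, r, t) in excluded:
            num_del += 1
        else:
            head_cluster[h].append((r, t))
            tail_cluster[t].append((r, h))
        total += 1
    return dict(head_cluster), dict(tail_cluster), num_del, total


# Toy triples for illustration only; these IDs are made up.
sample = ["Q1\tP31\tQ5\n", "Q1\tP27\tQ30\n", "Q2\tP31\tQ5\n", "bad line\n"]
heads, tails, num_del, total = build_clusters(
    sample, excluded={("Q2", "P31", "Q5")}
)
print(heads)
print(tails)
print(num_del, total)
```

With an empty `excluded` set, nothing is deleted and every well-formed triple ends up in both clusters, which is the behavior you get when `fewrel_triples` stays empty.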
Hello, should these lines be commented out as well? I only commented out "with open('../ernie_data/fewrel/test.json', 'r', encoding='utf-8') as fin:".
No, do not comment these out.
I see, but I cannot find the file "wikidata5m_triplet.txt". How can I obtain it?
Original message from Tianxiang Sun, sent October 3, 2021 at 22:16, to txsun1997/CoLAKE (cc: GX77, Author). Subject: Re: [txsun1997/CoLAKE] Where is "wikidata5m_triplet.txt"? (#14)
"This is to remove FewRel test set from our training data. If your need is not just reproducing the experiments, you can discard this part. The ernie_data is obtained from https://github.com/thunlp/ERNIE"

Does this comment mean that, if I want to retrain the model, I only need to comment out the following lines?

fewrel_triples = set()
'''
with open('../ernie_data/fewrel/test.json', 'r', encoding='utf-8') as fin:
    fewrel_data = json.load(fin)
    for ins in fewrel_data:
        r = ins['label']
        h, t = ins['ents'][0][0], ins['ents'][1][0]
        fewrel_triples.add((h, r, t))
print('# triples in FewRel test set: {}'.format(len(fewrel_triples)))
print(list(fewrel_triples)[0])
'''
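As an alternative to commenting the block in and out by hand, the FewRel filter could be made optional. The sketch below is only an illustration: `load_fewrel_triples` is a hypothetical helper, not a function from the repository, and it assumes the JSON layout implied by the snippet above (a list of instances with a `label` and an `ents` list whose entries start with the entity ID). If the file is missing, it returns an empty set, so no triples are filtered out.

```python
import json
import os


def load_fewrel_triples(path=None):
    """Return the set of (h, r, t) triples to exclude, or an empty set.

    Hypothetical helper: if `path` is None or the file does not exist,
    nothing is excluded and all training triples are kept.
    """
    triples = set()
    if path is None or not os.path.exists(path):
        return triples
    with open(path, "r", encoding="utf-8") as fin:
        for ins in json.load(fin):
            r = ins["label"]
            h, t = ins["ents"][0][0], ins["ents"][1][0]
            triples.add((h, r, t))
    return triples


fewrel_triples = load_fewrel_triples("../ernie_data/fewrel/test.json")
print("# triples in FewRel test set: {}".format(len(fewrel_triples)))
```

Note that the original `print(list(fewrel_triples)[0])` would raise an `IndexError` on an empty set, which is why it has to stay inside the commented-out block when the filter is disabled; the sketch simply omits it.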