The way to get triples - Githubissues

RichardHGL / WSDM2021_NSM

Improving Multi-hop Knowledge Base Question Answering by Learning Intermediate Supervision Signals. WSDM 2021.


The way to get triples #5

novice7 closed this issue 3 years ago

novice7 commented 3 years ago

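Hello! How did you obtain the Freebase triples before running PageRank?

Is there a way to get (entity, relation, value) triples, or does the data currently contain only (entity, relation, entity) triples?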

RichardHGL commented 3 years ago

Question 1: How do we get the Freebase triples before running PageRank? Answer: We first collect the triples within two hops of the topic entities (the entities mentioned in the question), and then run PageRank on that neighborhood to decide which entities to retain.

Question 2: Are there (entity, attribute, value) triples here? Answer: Yes. The data currently includes some triples of this form, and we do not treat them differently from (entity, relation, entity) triples.
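The two steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual preprocessing code: it assumes triples are already loaded as `(subject, relation, object)` tuples, and it assumes a personalized PageRank that restarts at the topic entities (the variant used in GraftNet-style pipelines). The function names and the `alpha`, `iters`, and `max_entities` parameters are hypothetical.

```python
from collections import defaultdict


def two_hop_triples(triples, topic_entities, hops=2):
    """Collect triples within `hops` steps of the topic entities."""
    # Index each triple under both endpoints so expansion ignores direction.
    by_node = defaultdict(list)
    for s, r, o in triples:
        by_node[s].append((s, r, o))
        by_node[o].append((s, r, o))

    frontier = set(topic_entities)
    seen = set(topic_entities)
    kept = set()
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for s, r, o in by_node.get(node, ()):
                kept.add((s, r, o))
                for n in (s, o):
                    if n not in seen:
                        seen.add(n)
                        next_frontier.add(n)
        frontier = next_frontier
    return kept


def personalized_pagerank(triples, topic_entities, alpha=0.8, iters=20):
    """Power iteration on the undirected entity graph (assumed variant)."""
    neighbors = defaultdict(set)
    for s, _, o in triples:
        neighbors[s].add(o)
        neighbors[o].add(s)
    seeds = [e for e in topic_entities if e in neighbors]
    if not seeds:
        return {}
    # Restart mass is spread uniformly over the topic entities.
    restart = {n: 0.0 for n in neighbors}
    for e in seeds:
        restart[e] = 1.0 / len(seeds)
    score = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - alpha) * restart[n] for n in neighbors}
        for n, nbrs in neighbors.items():
            share = alpha * score[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        score = nxt
    return score


def retain_top_entities(triples, topic_entities, max_entities):
    """Keep triples whose endpoints are both among the top-scored entities."""
    score = personalized_pagerank(triples, topic_entities)
    ranked = sorted(score, key=score.get, reverse=True)[:max_entities]
    keep = set(ranked) | set(topic_entities)
    return {(s, r, o) for s, r, o in triples if s in keep and o in keep}
```

Literal values are handled like any other node here, which matches the statement that (entity, attribute, value) triples are not treated specially.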

novice7 commented 3 years ago

Then may I ask where you obtained the Freebase triple data, given that Freebase has been shut down?

Sorry, my mistake: entities.txt does indeed contain values.

RichardHGL commented 3 years ago

The full Freebase dump can be downloaded from https://developers.google.com/freebase/.

In this paper, we use the Freebase dump distributed by Microsoft. You can download it with this command:

wget https://download.microsoft.com/download/A/E/4/AE428B7A-9EF9-446C-85CF-D8ED0C9B1F26/FastRDFStore-data.zip --no-check-certificate

After downloading and unpacking it, use fb_en.txt, which contains the English triples. For subgraph extraction, you can also refer to https://github.com/OceanskySun/GraftNet. I may add a more detailed description of the dataset preprocessing to this repo later, perhaps in one or two weeks.
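Because fb_en.txt is large, the neighborhood around the topic entities is easier to collect by streaming the file than by loading it whole. The sketch below assumes each line holds one triple as three tab-separated fields (subject, predicate, object); that layout is an assumption and should be checked against the downloaded file. The function name `expand_one_hop` is hypothetical.

```python
def expand_one_hop(lines, seeds):
    """Stream triples and keep those touching any seed entity.

    Returns the kept triples and the seed set grown by their endpoints.
    """
    kept = []
    entities = set(seeds)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            # Skip lines that do not match the assumed three-field layout.
            continue
        s, r, o = parts
        if s in seeds or o in seeds:
            kept.append((s, r, o))
            entities.update((s, o))
    return kept, entities
```

Calling it once per hop, each time over a freshly opened file (for example `open("fb_en.txt", encoding="utf-8")`) and passing the entity set returned by the previous call, yields the two-hop neighborhood at the cost of two passes over the file.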

novice7 commented 3 years ago

Do you know a faster way to get the real entity name from an entity MID? Is the only option to first download the Freebase/Wikidata mappings from https://developers.google.com/freebase/ and then look up the corresponding wiki page?

RichardHGL commented 3 years ago

You can find the entity name through the type.object.name attribute. For more details on using Freebase, I suggest searching online.
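As a rough sketch of that lookup, the names can be collected in a single pass over the triples by keeping only those whose predicate is type.object.name. This assumes tab-separated (subject, predicate, object) lines and that the predicate appears exactly as `type.object.name` with the name stored as a plain string; both are assumptions about the dump format, and `build_name_index` is a hypothetical helper.

```python
NAME_PREDICATE = "type.object.name"  # assumed spelling in the dump


def build_name_index(lines, wanted_mids=None):
    """Map each MID to the first name found for it."""
    names = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3 or parts[1] != NAME_PREDICATE:
            continue
        mid, _, name = parts
        if wanted_mids is not None and mid not in wanted_mids:
            continue
        # Keep the first name seen; later duplicates are ignored.
        names.setdefault(mid, name)
    return names
```

Passing the set of MIDs that appear in the extracted subgraphs as `wanted_mids` keeps the index small.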