PhilippChr / CONVEX

Code for our CIKM 2019 paper. As far as we know, CONVEX is the first unsupervised method for conversational question answering over knowledge graphs. A demo and our benchmark (and more) can be found at
https://convex.mpi-inf.mpg.de/
MIT License

How to get the Knowledge Graph #13

Closed SRL94 closed 2 years ago

SRL94 commented 2 years ago

Hi, how can I get the knowledge graph for ConvQuestions?

PhilippChr commented 2 years ago

Hi, it seems that the original file has been removed. Please send an email to pchristm@mpi-inf.mpg.de to get a copy.

Regards, Philipp

SRL94 commented 2 years ago

Hi Philipp,

We have unzipped the Wikidata dump and got two files: wikidata2018_09_11.hdt.index.v1-1 and wikidata2018_09_11.hdt. However, we cannot read their contents in any of the text editors we tried. How do you read them? Thank you for your help.

Best regards Sirui

PhilippChr commented 2 years ago

Hi Sirui,

HDT stores the KB in a compact binary format, so the files are not meant to be directly readable (see https://www.rdfhdt.org). If you do not want to deal with huge knowledge bases and just want convenient access to the relevant facts for a specific entity or question, I recommend also taking a look at our latest project, CLOCQ, which will be published as a full paper at WSDM 2022 and for which we will soon have an open API available: https://clocq.mpi-inf.mpg.de.
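For readers who want to query the .hdt file directly rather than open it in an editor, here is a minimal sketch. It assumes the third-party `hdt` Python package (pyHDT) is installed; the helper names `open_hdt` and `facts_for_entity` are made up for illustration and are not part of CONVEX.

```python
from itertools import islice


def open_hdt(path):
    """Open an HDT file. Assumes the third-party `hdt` package (pyHDT) is installed."""
    # Imported lazily so that facts_for_entity() below has no hard dependency.
    from hdt import HDTDocument
    return HDTDocument(path)


def facts_for_entity(doc, entity_iri, max_facts=20):
    """Return up to max_facts (subject, predicate, object) triples with the given subject."""
    # In pyHDT, empty strings act as wildcards; the call returns an
    # iterator over matching triples plus an estimated cardinality.
    triples, _cardinality = doc.search_triples(entity_iri, "", "")
    return list(islice(triples, max_facts))


# Example usage (not executed here; the path is the file named in this thread):
# doc = open_hdt("wikidata2018_09_11.hdt")
# for s, p, o in facts_for_entity(doc, "http://www.wikidata.org/entity/Q223495"):
#     print(s, p, o)
```

The `.hdt.index.v1-1` file is a companion index that HDT tooling typically expects next to the `.hdt` file, so it is probably best to keep both in the same directory.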

If you want to look at the plain Wikidata KB dump, you can download the latest dump (https://dumps.wikimedia.org/wikidatawiki/entities/). We also have a codebase for extracting a QA-related subset from Wikidata (https://github.com/PhilippChr/wikidata-core-for-QA).
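As a rough sketch of how the plain JSON dump can be read: the JSON dumps are one large array with one entity per line, so they can be streamed line by line without loading everything into memory. The function names and the file name in the usage comment are illustrative assumptions, not part of our code-base.

```python
import bz2
import json


def iter_entities(lines):
    """Yield one entity dict per line of a Wikidata JSON dump."""
    for line in lines:
        # Each entity line ends with a comma; the first and last lines
        # are the array brackets "[" and "]".
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue
        yield json.loads(line)


def iter_dump(path):
    """Stream entities from a bz2-compressed JSON dump file."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        yield from iter_entities(handle)


# Example usage (not executed here; assumes a dump such as latest-all.json.bz2
# has been downloaded from the URL above):
# for entity in iter_dump("latest-all.json.bz2"):
#     print(entity["id"])
#     break
```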

SRL94 commented 2 years ago

Hi Philipp,

Thanks for the reply. As for the Wikidata dump that you sent by email (wikidata2018_09_11.hdt.index.v1-1 and wikidata2018_09_11.hdt), has it already been reduced to a QA-related subset?

Best regards Sirui

PhilippChr commented 2 years ago

Hi Sirui,

No, this is the full Wikidata dump from that timestamp.

Regards, Philipp

SRL94 commented 2 years ago

Hi Philipp,

Thanks a lot. Could you please explain how to construct the context graph for turn 0? For example, given the question "When did The Carpenters sign with A&M Records?", the seed entity "The Carpenters" and the answer "1969", I identified the following answer triple:

http://www.wikidata.org/entity/Q223495
http://www.wikidata.org/prop/direct/P571
"1969-01-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>

Does the context graph contain only this triple, or does it also contain triples of the form (The Carpenters, ?, A&M Records)? From your methodology, my understanding is that the context graph contains only one triple; from Figure 1, however, it seems that the context graph also contains triples of the form (The Carpenters, ?, A&M Records).

I sincerely appreciate your help. Best regards, Sirui

PhilippChr commented 2 years ago

Hi Sirui,

The qualifiers are missing in this case. The fact "Carpenters, record label, A&M Records" is there in Wikidata (https://www.wikidata.org/wiki/Q223495), but it is missing the qualifier "point in time, 1969".

In Figure 1, these qualifiers are there.
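To illustrate what such a qualifier looks like in Wikidata's JSON format, here is a small sketch that flattens an entity's statements into (subject, property, value, qualifiers) tuples. The `SAMPLE` entity is hand-written for illustration and is not the actual Wikidata content: the object ID is a placeholder, and the qualifier is exactly the one that is missing in the real data. P264 (record label) and P585 (point in time) are the real property IDs.

```python
def _snak_value(snak):
    """Return a simple value for a snak, or None for 'novalue'/'somevalue' snaks."""
    if snak.get("snaktype") != "value":
        return None
    value = snak["datavalue"]["value"]
    if isinstance(value, dict):
        # Entity values carry an "id", time values carry a "time" string.
        return value.get("id") or value.get("time") or value
    return value


def statements_with_qualifiers(entity):
    """Yield (subject, property, value, qualifiers) for each statement of an entity."""
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            qualifiers = {
                qprop: [_snak_value(q) for q in qsnaks]
                for qprop, qsnaks in statement.get("qualifiers", {}).items()
            }
            yield entity["id"], prop, _snak_value(statement["mainsnak"]), qualifiers


# Hand-written illustration, NOT real Wikidata content.
SAMPLE = {
    "id": "Q223495",  # The Carpenters
    "claims": {
        "P264": [{  # record label
            "mainsnak": {
                "snaktype": "value",
                "property": "P264",
                # Placeholder, not a real Wikidata ID.
                "datavalue": {"type": "wikibase-entityid",
                              "value": {"entity-type": "item", "id": "Q_AM_RECORDS"}},
            },
            "qualifiers": {
                "P585": [{  # point in time
                    "snaktype": "value",
                    "property": "P585",
                    "datavalue": {"type": "time",
                                  "value": {"time": "+1969-00-00T00:00:00Z", "precision": 9}},
                }],
            },
        }],
    },
}
```

Calling `list(statements_with_qualifiers(SAMPLE))` gives a single tuple whose last element maps `P585` to the 1969 time string. When the qualifier is absent, as in the real data, that dictionary is simply empty.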

Regards, Philipp