thunlp / KACC

KACC: A Multi-task Benchmark for Knowledge Abstraction, Concretization and Completion

Multi-hop reasoning dataset #4

Open chrislouis0106 opened 2 years ago

chrislouis0106 commented 2 years ago

Hi, for the 2(3)-hop-ins(sub)-triples datasets, could you give a detailed explanation of the 2-hop and 3-hop paths? Looking at the raw files, I'm confused about how to obtain a multi-hop inference dataset or its paths. Thank you.

jayzzhou-thu commented 2 years ago

Hi, thanks for your attention to our work.

The 2(3)-hop-ins(sub)-triples are constructed in this way:

  1. For example, if (e1, sub, e2) and (e2, sub, e3) are both in the KG, we add the triple (e1, sub, e3) to the candidate set.
  2. For all triples in the candidate set, we ask annotators to check their validity and remove triples with semantic drift, as shown in our paper.
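The first step above can be sketched in Python as follows. This is only an illustration of the composition rule, not the script we used: the function name `two_hop_candidates`, the in-memory triple format, and the choice to skip self-loops are all assumptions. The manual annotation in step 2 is not modeled.

```python
from collections import defaultdict

def two_hop_candidates(triples, relation="sub"):
    """Compose (e1, r, e2) and (e2, r, e3) into candidate triples (e1, r, e3)."""
    # Map each head entity to the set of tails reachable through `relation`.
    successors = defaultdict(set)
    for head, rel, tail in triples:
        if rel == relation:
            successors[head].add(tail)

    candidates = set()
    for e1, middles in successors.items():
        for e2 in middles:
            # .get() avoids inserting new keys while iterating over the dict.
            for e3 in successors.get(e2, ()):
                if e3 != e1:  # assumption: self-loops are not useful candidates
                    candidates.add((e1, relation, e3))
    return candidates

# Toy KG with placeholder entity names.
kg = [("e1", "sub", "e2"), ("e2", "sub", "e3"), ("e3", "sub", "e4")]
print(sorted(two_hop_candidates(kg)))
```

Every candidate produced this way would still need the validity check from step 2 before it could be treated as a real triple.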

So, in the end, our datasets provide triples inferred through multi-hop paths. The paths used to construct these triples are omitted, but you can recover them from the one-hop triple files such as cross-triples.txt and cpt-triples.txt.

chrislouis0106 commented 2 years ago

Thank you, I got that. However, I'm not sure which files were used to build the multi-hop triples. Could you clarify? My guess is that cross-triples, cpt-triples and ent-triples are used.

jayzzhou-thu commented 2 years ago

Hi, you should use cpt-triples and cross-triples to extract the paths. However, there may be multiple paths between two instances. For your convenience, I have uploaded the paths we used in the annotation process; you can match them against the multi-hop triples.

For multi-hop instanceOf triples, the relations in paths should be (instanceOf, subclassOf, subclassOf). For multi-hop subclassOf triples, the relations in paths should be (subclassOf, subclassOf, subclassOf).
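If you prefer to extract paths yourself, a minimal sketch could look like the one below. It assumes the one-hop triples from cpt-triples and cross-triples have already been loaded into a list of `(head, relation, tail)` tuples; the function name `find_paths` and the relation labels are placeholders (the actual files may use IDs such as P31 instead of `instanceOf`). It returns every path that follows the given relation pattern, which is why more than one path can come back for the same pair.

```python
from collections import defaultdict

def find_paths(triples, head, tail, pattern):
    """Return all entity paths from `head` to `tail` whose relations follow `pattern`."""
    # Index tails by (head, relation) for quick expansion.
    out_edges = defaultdict(list)
    for h, r, t in triples:
        out_edges[(h, r)].append(t)

    paths = []

    def walk(node, depth, path):
        if depth == len(pattern):
            if node == tail:
                paths.append(tuple(path))
            return
        for nxt in out_edges.get((node, pattern[depth]), ()):
            walk(nxt, depth + 1, path + [nxt])

    walk(head, 0, [head])
    return paths

# Toy data with placeholder names; two different paths connect "a" and "d".
toy_triples = [
    ("a", "instanceOf", "b"),
    ("b", "subclassOf", "c"),
    ("c", "subclassOf", "d"),
    ("a", "instanceOf", "x"),
    ("x", "subclassOf", "c"),
]
ins_pattern = ("instanceOf", "subclassOf", "subclassOf")
for p in find_paths(toy_triples, "a", "d", ins_pattern):
    print(" -> ".join(p))
```

Because several paths can match, the uploaded paths.txt remains the reliable record of which path was actually used during annotation.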

paths.txt

chrislouis0106 commented 2 years ago

Yes, I see, thank you. Using paths.txt, I can find the 2-hop paths by extracting them from cpt-triples and cross-triples, but I'm confused about how to distinguish multi-hop instanceOf triples from subclassOf triples, and how to get the 3-hop paths.

jayzzhou-thu commented 2 years ago

Hi, it would be easier to use paths.txt together with the 2(3)-hop-ins(sub)-triples.txt files.

For example, take the first triple in 3-hop-ins-triples.txt in KACC-S, (Q648666, P31, Q327333): you can look it up in paths.txt and find the path (Q648666, Q15711797, Q192350, Q327333).

You can find the corresponding 2-hop or subclassOf triples in the same way, by looking them up in the other 2(3)-hop-ins(sub)-triples files.
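A rough sketch of that lookup is below. It assumes each line of paths.txt is a whitespace-separated list of entity IDs running from head to tail; please check the actual file layout and adjust the parsing if it differs. The helper name `index_paths` is hypothetical, and the sample line is just the path from the example above.

```python
from collections import defaultdict

def index_paths(lines):
    """Index paths by their (head, tail) pair so multi-hop triples can be looked up."""
    index = defaultdict(list)
    for line in lines:
        nodes = line.split()  # assumption: entity IDs separated by whitespace
        if len(nodes) >= 3:   # a multi-hop path has at least one intermediate node
            index[(nodes[0], nodes[-1])].append(nodes)
    return index

# In practice the lines would come from open("paths.txt"); one sample line is used here.
sample_lines = ["Q648666 Q15711797 Q192350 Q327333"]
index = index_paths(sample_lines)

# Look up the triple (Q648666, P31, Q327333) by its head and tail.
for path in index[("Q648666", "Q327333")]:
    print(len(path) - 1, "hops:", " -> ".join(path))
```

The number of hops is the path length minus one, and whether a triple is instanceOf or subclassOf follows from which 2(3)-hop-ins(sub)-triples file it came from.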