usc-isi-i2 / kgtk

Knowledge Graph Toolkit
https://kgtk.readthedocs.io/en/latest/
MIT License

Query Documentation: Query all node2 for a given node1, and multiple relations #330

Open mann-brinson opened 3 years ago

mann-brinson commented 3 years ago

Is your feature request related to a problem? Please describe.

I need a way to query all possible objects (node2) for a given subject (node1) and multiple relations (label). This would be helpful for users who want to find all possible classes of a given instance.

Example: I want to find all classes (node2) of a subject (node1) called ('amoxicillin', Q201928). A class can be defined as objects of labels ('instance of', P31), and ('subclass of', P279).

Describe the solution you'd like

I need a means to produce this in a single kypher query. Below are examples of how this is done using either 1) Cypher or 2) Wikibase CLI.

Describe alternatives you've considered

1. Cypher: Here is an example of the same request in the Cypher documentation. https://neo4j.com/docs/cypher-manual/current/clauses/match/#match-on-multiple-rel-types

2. Wikibase CLI: You can also use the wikibase cli to find classes / subclasses, but kypher must also handle this. https://github.com/maxlath/wikibase-cli

Example: Wikibase CLI

! wd u Q201928

>>id Q201928
>>Label amoxicillin
>>Description antibiotic useful for the treatment of a number of bacterial infections
>>instance of (P31): chemical compound (Q11173) | medication (Q12140) | carboxylic acid (Q134856) | penicillin (Q12190) | essential medicine (Q35456)
>>subclass of (P279): bactericide (Q804539) | penicillin (Q12190)

Additional context

This functionality can be used in downstream tasks for creating subgraphs of Wikidata.

dgarijo commented 3 years ago

The query:

kgtk query -i file \
--match 'file: (n1)-[l1 {label: p}]->(n2)' \
--return 'distinct n1 as n1, p as label, n2 as n2' \
--where 'p = "P31" OR p = "P279"'

mann-brinson commented 3 years ago

I have tried this, and can't get a result. Can you confirm running this produces some result?

dgarijo commented 3 years ago

Have you verified that you got results for either? I have not tested the query yet.

dgarijo commented 3 years ago

Union queries are not supported at the moment.

mann-brinson commented 3 years ago

Test 1 - Query: Get the class (P31) of an instance

claims = "claims.tsv.gz"

!kgtk query -i {claims} \
--match '(n1)-[:P31]->(n2)' \
--where 'n1 = "Q5451712"'

Test 1 - Result: Working

[2021-02-10 18:19:06 query]: SQL Translation:
---------------------------------------------
  SELECT *
     FROM graph_1 AS graph_1_c1
     WHERE graph_1_c1."label"=?
     AND (graph_1_c1."node1" = ?)
  PARAS: ['P31', 'Q5451712']
---------------------------------------------
id  node1   label   node2   rank    node2;wikidatatype
Q5451712-P31-Q281-2d4512be-0    Q5451712    P31 Q281    normal  wikibase-item
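
The SQL translation above hints that a multi-relation lookup does not have to be a UNION in SQL terms: the label filter is just one more condition in the WHERE clause. Below is a minimal sketch using Python's built-in sqlite3 module, not KGTK itself. The table name graph_1 and the node1/label/node2 columns come from the log above; the in-memory database, the hand-written SQL and the sample edges (a few amoxicillin claims from the Wikibase CLI output in the first comment, plus the one row returned by Test 1) are assumptions for illustration. The SQL that Kypher actually generates for such a query may differ.

```python
import sqlite3

# Toy stand-in for the claims graph; the real KGTK graph table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graph_1 (node1 TEXT, label TEXT, node2 TEXT)")
conn.executemany(
    "INSERT INTO graph_1 VALUES (?, ?, ?)",
    [
        ("Q201928", "P31", "Q11173"),
        ("Q201928", "P31", "Q12140"),
        ("Q201928", "P279", "Q804539"),
        ("Q201928", "P279", "Q12190"),
        ("Q5451712", "P31", "Q281"),
    ],
)


def objects_for(node1, labels):
    """Return (label, node2) pairs for node1, restricted to the given labels."""
    placeholders = ", ".join("?" for _ in labels)
    sql = (
        "SELECT DISTINCT label, node2 FROM graph_1 "
        f"WHERE node1 = ? AND label IN ({placeholders}) "
        "ORDER BY label, node2"
    )
    return conn.execute(sql, [node1, *labels]).fetchall()


rows = objects_for("Q201928", ["P31", "P279"])
for label, node2 in rows:
    print(label, node2)
```

Binding the labels as parameters mirrors the PARAS list in the log and keeps the number of relations flexible without editing the query text.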

Test 2 - Query: Get the class (P31) of an instance

!{kypher} -i {claims} \
--match 'claims: (n1)-[l1 {label: p}]->(n2)' \
--where 'n1 = "Q5451712" and p = "P31"'

Test 2 - Result: Not working

/bin/bash: {kypher}: command not found

Test 3 - Query: Get class and subclass of instance

!{kypher} -i {claims} \
--match 'claims: (n1)-[l1 {label: p}]->(n2)' \
--where 'n1 = "Q5451712" and (p = "P31" OR p = "P279")'

Test 3 - Result: Not working

/bin/bash: {kypher}: command not found
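
The error in Tests 2 and 3 comes from the shell rather than from Kypher: bash received the literal text {kypher}, which suggests that no notebook variable named kypher was defined when the cells ran, so neither test reached the query engine. Here is a minimal sketch of building the Test 3 command explicitly in Python. It assumes that kypher was meant as shorthand for `kgtk query` (the command used in Test 1); the helper name build_command is hypothetical. Whether the resulting query returns rows is not verified here.

```python
import shlex

# Assumption: `kypher` was intended as shorthand for the command from Test 1.
kypher = "kgtk query"
claims = "claims.tsv.gz"


def build_command(node1, labels):
    """Build the argument list for a query on one node1 and several labels."""
    label_test = " OR ".join(f'p = "{label}"' for label in labels)
    return shlex.split(kypher) + [
        "-i", claims,
        "--match", "claims: (n1)-[l1 {label: p}]->(n2)",
        "--where", f'n1 = "{node1}" and ({label_test})',
    ]


cmd = build_command("Q5451712", ["P31", "P279"])
# Show the shell-quoted command line that would be executed.
print(shlex.join(cmd))
# To run it where kgtk is installed: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids a second layer of shell quoting around the match and where clauses.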

mann-brinson commented 3 years ago

@dgarijo any further advice?

dgarijo commented 3 years ago

If I recall correctly, @chalypso said UNION queries are not supported at this time, so the last query is not going to work right now.