RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

No DOID nodes connected with treats edges? #96

Open finnagin opened 3 years ago

finnagin commented 3 years ago

It's possible I'm misunderstanding some recent changes to KG2 or the Biolink model here, but when I run the following query in the KG2.6.7 Neo4j:

match (n:`biolink:Disease`)-[r:`biolink:treats`]-(m) where n.id starts with 'DOID' return distinct n.id limit 50

I get (no changes, no records)

Why don't we have any DOID nodes connected by treats edges?

ecwood commented 3 years ago

It looks like there's a limited number of predicates on edges that link to a DOID node. This is from KG2.7.0's Neo4j:

match (n)-[r]-(m) where n.id starts with 'DOID' return r.predicate, r.relation, r.provided_by, count(distinct r) order by count(distinct r) desc
r.predicate r.relation r.provided_by count(distinct r)
"biolink:gene_associated_with_condition" "JensenLab:associated_with" ["JensenLab:"] 718330
"biolink:same_as" "MONDO:equivalentTo" ["OBO:mondo.owl"] 18000
"biolink:close_match" "oboFormat:xref" ["OBO:mondo.owl"] 15118
"biolink:related_to" "REACT:linked_to_disease" ["identifiers_org_registry:reactome"] 14991
"biolink:subclass_of" "rdfs:subClassOf" ["OBO:doid.owl"] 13421
"biolink:close_match" "oboFormat:xref" ["EFO:efo.owl"] 5207
"biolink:contributes_to" "IDO:0000664" ["OBO:doid.owl"] 2351
"biolink:close_match" "oboFormat:xref" ["OBO:doid.owl"] 1239
"biolink:has_phenotype" "OBO:doid#has_symptom" ["OBO:doid.owl"] 848
"biolink:located_in" "RO:0001025" ["OBO:doid.owl"] 628
"biolink:derives_from" "OBO:doid#derives_from" ["OBO:doid.owl"] 373
"biolink:causes" "RO:0004019" ["OBO:doid.owl"] 253
"biolink:subclass_of" "rdfs:subClassOf" ["OBO:doid.owl", "OBO:genepio.owl"] 241
"biolink:has_phenotype" "RO:0002200" ["OBO:doid.owl"] 183
"biolink:contributes_to" "RO:0003304" ["OBO:doid.owl"] 181
"biolink:causes" "RO:0001022" ["OBO:doid.owl"] 124
"biolink:related_to" "RO:0002451" ["OBO:doid.owl"] 74
"biolink:close_match" "oboFormat:xref" ["umls_source:HPO"] 33
"biolink:subclass_of" "rdfs:subClassOf" ["OBO:genepio.owl"] 19
"biolink:subclass_of" "rdfs:subClassOf" ["EFO:efo.owl"] 16
"biolink:coexists_with" "SO:has_origin" ["OBO:doid.owl"] 3
"biolink:has_participant" "RO:has_participant" ["EFO:efo.owl"] 2
"biolink:coexists_with" "RO:0002220" ["OBO:doid.owl"] 2
"biolink:located_in" "EFO:0000784" ["EFO:efo.owl"] 1
"biolink:related_to" "IAO:0000136" ["EFO:efo.owl"] 1
"biolink:related_to" "OBI:0001927" ["OBO:genepio.owl"] 1
Filtering out those not from DOID:

r.predicate r.relation r.provided_by count(distinct r)
"biolink:subclass_of" "rdfs:subClassOf" ["OBO:doid.owl"] 13421
"biolink:contributes_to" "IDO:0000664" ["OBO:doid.owl"] 2351
"biolink:close_match" "oboFormat:xref" ["OBO:doid.owl"] 1239
"biolink:has_phenotype" "OBO:doid#has_symptom" ["OBO:doid.owl"] 848
"biolink:located_in" "RO:0001025" ["OBO:doid.owl"] 628
"biolink:derives_from" "OBO:doid#derives_from" ["OBO:doid.owl"] 373
"biolink:causes" "RO:0004019" ["OBO:doid.owl"] 253
"biolink:subclass_of" "rdfs:subClassOf" ["OBO:doid.owl", "OBO:genepio.owl"] 241
"biolink:has_phenotype" "RO:0002200" ["OBO:doid.owl"] 183
"biolink:contributes_to" "RO:0003304" ["OBO:doid.owl"] 181
"biolink:causes" "RO:0001022" ["OBO:doid.owl"] 124
"biolink:related_to" "RO:0002451" ["OBO:doid.owl"] 74
"biolink:coexists_with" "SO:has_origin" ["OBO:doid.owl"] 3
"biolink:coexists_with" "RO:0002220" ["OBO:doid.owl"] 2

As to why we don't have DOID biolink:treats edges, I'm still looking into that.
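
The filtering step above can be sketched in plain Python. This is only an illustration, assuming that "from DOID" means the r.provided_by list contains "OBO:doid.owl" (which matches the rows kept in the filtered output). The rows are a small subset copied from the query output, and the names from_doid, DOID_SOURCE, and has_treats are hypothetical, not part of the KG2 build system.

```python
# A subset of rows copied from the query output above:
# (predicate, relation, provided_by, edge count).
rows = [
    ("biolink:gene_associated_with_condition", "JensenLab:associated_with", ["JensenLab:"], 718330),
    ("biolink:same_as", "MONDO:equivalentTo", ["OBO:mondo.owl"], 18000),
    ("biolink:subclass_of", "rdfs:subClassOf", ["OBO:doid.owl"], 13421),
    ("biolink:close_match", "oboFormat:xref", ["EFO:efo.owl"], 5207),
    ("biolink:subclass_of", "rdfs:subClassOf", ["OBO:doid.owl", "OBO:genepio.owl"], 241),
    ("biolink:subclass_of", "rdfs:subClassOf", ["OBO:genepio.owl"], 19),
]

# Assumption: an edge counts as "from DOID" when this source is in provided_by.
DOID_SOURCE = "OBO:doid.owl"


def from_doid(rows, source=DOID_SOURCE):
    """Keep rows whose provided_by list includes the DOID ontology file."""
    return [row for row in rows if source in row[2]]


doid_rows = from_doid(rows)

# Distinct predicates that DOID itself contributes in this subset.
doid_predicates = sorted({predicate for predicate, _, _, _ in doid_rows})

# Check whether any DOID-provided row carries the treats predicate.
has_treats = "biolink:treats" in doid_predicates
```

Membership in the provided_by list, rather than equality with it, is what keeps the row provided by both OBO:doid.owl and OBO:genepio.owl while dropping the row provided by OBO:genepio.owl alone.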

ecwood commented 3 years ago

Below are the relation IRIs present in DOID: