dkoslicki closed 1 month ago
I opened a Cypher session on kg2canonicalized.rtx.ai, which contains KG2.7.5c, and ran the following Cypher query:
match (n)-[r:`biolink:subclass_of`]->(m) where n.id =~ 'UBERON:.*' and m.id =~ 'UBERON:.*' return count(*);
and it returned 24,639. So it appears that there are 24,639 edges of type UBERON-[subclass_of]->UBERON in KG2.7.5c. Here are some examples:
match (n)-[r:`biolink:subclass_of`]->(m) where n.id =~ 'UBERON:.*' and m.id =~ 'UBERON:.*' return n.id, r.predicate, r.knowledge_source, m.id limit 10;
returning:
n.id | r.predicate | r.knowledge_source | m.id
-- | -- | -- | --
"UBERON:0018355" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008293" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008292" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008291" | "biolink:subclass_of" | ["infores:genepio", "infores:uberon"] | "UBERON:0000022"
"UBERON:0034930" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0014480" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018688" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018538" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018539" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018537" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
So this seems to be an issue not with KG2c but perhaps with the RTX-KG2 API or PloverDB. I am tagging @amykglen in the hope that she can weigh in. If it is the RTX-KG2 API or PloverDB, I would vote to transfer this issue to the RTX repo issue tracker.
yes, this is something we need to do with Plover. when we implemented subclass_of reasoning we only did it for the more common kinds of pinned query nodes (drugs, diseases), which seemed sufficient early on, but now we need to expand that. so I agree this issue can be transferred to the RTX repo.
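For anyone following along, here is a rough sketch of what subclass_of reasoning on a pinned query node amounts to: the pinned ID is expanded to itself plus all of its descendants before edges are looked up. This is not Plover's actual code. The function names are hypothetical, the first two edges are taken from the table above, and the `EXAMPLE:0000001` node is made up to show that expansion follows more than one level.

```python
from collections import defaultdict, deque


def build_children_index(subclass_edges):
    """Map each parent ID to the set of its direct subclasses.

    `subclass_edges` is an iterable of (child_id, parent_id) pairs,
    i.e. child -[biolink:subclass_of]-> parent.
    """
    children = defaultdict(set)
    for child, parent in subclass_edges:
        children[parent].add(child)
    return children


def expand_with_descendants(pinned_ids, children):
    """Return the pinned IDs plus every descendant reachable via subclass_of."""
    expanded = set(pinned_ids)
    queue = deque(pinned_ids)
    while queue:
        current = queue.popleft()
        # .get avoids creating empty entries in the defaultdict
        for child in children.get(current, ()):
            if child not in expanded:  # also guards against cycles
                expanded.add(child)
                queue.append(child)
    return expanded


edges = [
    ("UBERON:0018355", "UBERON:0000022"),   # from the query results above
    ("UBERON:0008293", "UBERON:0000022"),   # from the query results above
    ("EXAMPLE:0000001", "UBERON:0008293"),  # hypothetical grandchild node
]
children = build_children_index(edges)
print(sorted(expand_with_descendants(["UBERON:0000022"], children)))
```

A query pinned to `UBERON:0000022` would then be answered as if it were pinned to the whole expanded set, which is why results for a subclass can show up under its parent term.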
I'll be addressing this soon at the same time as #1812
try to fix for end of Sprint 6? @amykglen
this is live on KG2 Plover CI! for instance, submitting this query for molecular activities related to 'exocrine gland' to kg2cploverdb.ci.transltr.io returns results involving 'exocrine gland' but also 'liver':
{
  "edges": {
    "e00": {
      "object": "n01",
      "predicates": [
        "biolink:related_to"
      ],
      "subject": "n00"
    }
  },
  "nodes": {
    "n00": {
      "ids": [
        "UBERON:0002365"
      ]
    },
    "n01": {
      "categories": [
        "biolink:MolecularActivity"
      ]
    }
  }
}
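If you would rather send the query from a script, something like the following should work. It is a sketch under assumptions: the helper names are hypothetical, and the `/query` path in the commented-out URL is my guess at the endpoint rather than something confirmed in this thread. The request body is the same query graph as above.

```python
import json
import urllib.request


def build_query_graph(pinned_curie, object_category, predicate="biolink:related_to"):
    """Build a one-hop query graph: pinned node n00 -> any n01 of the given category."""
    return {
        "edges": {
            "e00": {
                "object": "n01",
                "predicates": [predicate],
                "subject": "n00",
            }
        },
        "nodes": {
            "n00": {"ids": [pinned_curie]},
            "n01": {"categories": [object_category]},
        },
    }


def submit_query(query_graph, url):
    """POST the query graph as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query_graph).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


query = build_query_graph("UBERON:0002365", "biolink:MolecularActivity")
print(json.dumps(query, indent=2))

# Assumed endpoint path; uncomment to actually send the request.
# PLOVER_URL = "https://kg2cploverdb.ci.transltr.io/query"
# results = submit_query(query, PLOVER_URL)
```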
closing
From the Relay, it appears RTX-KG2 is not doing subclass inference for UBERON