NCATS-Gamma / robokop

Master UI for ROBOKOP
MIT License

Query will not run in ROBOKOP but is executable in ROBOKOP KG #495

Open karafecho opened 4 years ago

karafecho commented 4 years ago

I attempted to use ROBOKOP to answer the MCAT question below.

MCAT Question: Steatorrhea is the presence of increased fat in feces. Which organ is least likely to be the cause of a patient’s steatorrhea?

Correct: Stomach

Incorrect: Pancreas; Small intestine; Liver

Initially, this question was problematic because stomach, pancreas, and liver lacked identifiers; that was fixed on 2/13/2020. However, the query still does not run, even when I do not specify "anatomical entity".


cbizon commented 4 years ago

A little more information. Here's a robokop query that returns no answers:

https://robokop.renci.org/q/3692d7a6-83c1-4c5b-8641-66d718a55f8a/

But, if I run the query in robokopdb2 (I think the live one?) then I get answers:

match (p:phenotypic_feature {id:"HP:0002570"})<-[:has_phenotype]-(d:disease)-[:located_in]->(a:anatomical_entity) return * limit 10

patrickkwang commented 4 years ago

Perhaps some strange synonymization is happening here. This is the query graph:

  {
    "nodes": [
      {
        "id": "n0",
        "type": "disease_or_phenotypic_feature",
        "set": false,
        "name": "Steatorrhea",
        "curie": [
          "HP:0002570"
        ]
      },
      {
        "id": "n1",
        "type": "disease",
        "set": false
      },
      {
        "id": "n2",
        "type": "anatomical_entity",
        "set": false
      }
    ],
    "edges": [
      {
        "type": [
          "has_phenotype"
        ],
        "id": "e0",
        "source_id": "n1",
        "target_id": "n0"
      },
      {
        "type": [
          "located_in"
        ],
        "id": "e1",
        "source_id": "n1",
        "target_id": "n2"
      }
    ]
  }

and this is the derived Cypher query:

MATCH (n1:disease {})-[e0]->(n0:disease_or_phenotypic_feature {})
USING INDEX n0:disease_or_phenotypic_feature(id)
WHERE (n0.id = 'MONDO:0001075') AND (type(e0) = "has_phenotype")
MATCH (n1)-[e1]->(n2:anatomical_entity {})
WHERE (type(e1) = "located_in")
WITH n0.id as n0, n1.id as n1, n2.id as n2, collect(distinct e0.id) as e0, collect(distinct e1.id) as e1
RETURN {n0:n0, n1:n1, n2:n2} as nodes, {e0:e0, e1:e1} as edges
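
The mismatch behind the "strange synonymization" guess can be made explicit by comparing the curie pinned in the query graph with the id that ends up in the derived Cypher. The following is a minimal sketch, not part of ROBOKOP: the function names (`pinned_curies`, `cypher_ids`, `rewritten_ids`) are hypothetical, the query graph is trimmed to its nodes, and the regular expression only assumes the `<var>.id = '<id>'` pattern seen in the query above.

```python
import json
import re

# Trimmed copy of the query graph from this thread (nodes only).
QUERY_GRAPH = json.loads("""
{
  "nodes": [
    {"id": "n0", "type": "disease_or_phenotypic_feature", "curie": ["HP:0002570"]},
    {"id": "n1", "type": "disease"},
    {"id": "n2", "type": "anatomical_entity"}
  ]
}
""")

# First WHERE clause of the derived Cypher query, copied from this thread.
DERIVED_WHERE = "WHERE (n0.id = 'MONDO:0001075') AND (type(e0) = \"has_phenotype\")"


def pinned_curies(query_graph):
    """Map node id -> set of curies the user pinned in the query graph."""
    return {
        node["id"]: set(node["curie"])
        for node in query_graph["nodes"]
        if node.get("curie")
    }


def cypher_ids(cypher):
    """Map node variable -> set of ids matched via `<var>.id = '<id>'`."""
    found = {}
    for var, curie in re.findall(r"(\w+)\.id\s*=\s*['\"]([^'\"]+)['\"]", cypher):
        found.setdefault(var, set()).add(curie)
    return found


def rewritten_ids(query_graph, cypher):
    """Return {node: (requested, used)} for nodes whose id was changed."""
    requested = pinned_curies(query_graph)
    used = cypher_ids(cypher)
    return {
        node: (requested[node], used[node])
        for node in requested
        if node in used and requested[node] != used[node]
    }


if __name__ == "__main__":
    for node, (asked, got) in rewritten_ids(QUERY_GRAPH, DERIVED_WHERE).items():
        print(f"{node}: requested {sorted(asked)}, Cypher uses {sorted(got)}")
```

For the data in this thread, the check flags `n0`: the query graph asks for `HP:0002570`, while the derived Cypher matches on `MONDO:0001075`.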

cbizon commented 4 years ago

Hmm. @YaphetKG are we by any chance using the production redis server for the new robokop mini-crawls?

YaphetKG commented 4 years ago

We are using another instance for those.

karafecho commented 4 years ago

Thanks, all.

Note that I kind of need this issue resolved sooner rather than later so that I can complete my second MCAT 'experiment' by the end of the month and include the results on a poster that I am preparing for the AMIA 2020 Informatics Summit (you all are authors, btw). Poster will then be turned into a journal article. (Think: incentive.) :-)

However, I don't want to distract from other activities, so if the issue will require too much time to resolve, then I'll develop Plan B.

karafecho commented 4 years ago

@patrickkwang @YaphetKG @cbizon : Any status updates on this issue?

cbizon commented 4 years ago

It's going to take some time to fix. Can you try using the MONDO term MONDO:0001075 for steatorrhea instead?

karafecho commented 4 years ago

I had already tried the MONDO ID.

I've given up on the original query, at least for the time being. I am probably going to use this answer list. Thoughts?

karafecho commented 4 years ago

But check out what happens when I add the predicate "gene_to_expression_site_association".

cbizon commented 4 years ago

> I had already tried the MONDO ID.

And what happened in that case? Still no results?

karafecho commented 4 years ago

Sorry, yes, the MONDO ID yielded no results. I tried a few query permutations with both the MONDO ID and the HP ID, and nothing ran.

But the newly structured query works: MONDO:0001075 - gene - anatomical entity.