NCATS-Gamma / robokop

Master UI for ROBOKOP
MIT License

Question generalization: Node hopping #244

Open cbizon opened 5 years ago

cbizon commented 5 years ago

There's a related set of issues that I'm going to lump under the general heading of query brittleness: cases where you got no answers, or poor ones, because you started with the wrong node.

Maybe you started with a very specific form of a disease (genetic susceptibility to asthma type 32), and you would have gotten answers that you liked fine if you went up to e.g. susceptibility to asthma, or up higher to asthma.

Maybe you started with too general a query and the nodes are too blobby, and there are cool answers in some sub-version.

Maybe you started with the HPO term for obesity, and there would be better answers for the MONDO term for obesity (disease) or for things that are related, but not in a clearly ontological way, like BMI.

Maybe you started with one chemical and got nothing, but there is a very similar chemical where you would get something (like anhydrous codeine vs hydrous codeine).
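To make the "go up or down a level" cases concrete, here is a minimal sketch of how alternative start nodes might be proposed. Everything in it is an assumption for illustration: the `EX:` identifiers are placeholders rather than real ontology terms, and `PARENTS`, `hops`, and `hop_candidates` are hypothetical names, not part of ROBOKOP.

```python
from collections import deque

# Hypothetical child -> parents map (placeholder IDs, not real ontology terms).
PARENTS = {
    "EX:asthma_susceptibility_32": ["EX:asthma_susceptibility"],
    "EX:asthma_susceptibility": ["EX:asthma"],
    "EX:asthma": ["EX:respiratory_disease"],
}


def invert(parents):
    """Build a parent -> children map from a child -> parents map."""
    children = {}
    for child, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(child)
    return children


def hops(start, neighbors, max_hops):
    """Breadth-first walk; returns {node: hop distance}, excluding start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


def hop_candidates(start, max_hops=2):
    """Alternative start nodes, nearest first: (node, direction, distance)."""
    up = hops(start, PARENTS, max_hops)
    down = hops(start, invert(PARENTS), max_hops)
    ranked = [(n, "more general", d) for n, d in up.items()]
    ranked += [(n, "more specific", d) for n, d in down.items()]
    return sorted(ranked, key=lambda t: (t[2], t[0]))


if __name__ == "__main__":
    for candidate in hop_candidates("EX:asthma_susceptibility_32"):
        print(candidate)
```

Ranking by hop distance is only one plausible choice; the cross-ontology and "similar chemical" cases would need equivalence or similarity edges rather than a parent/child map.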


Going to make a different issue for edge stuff.

cbizon commented 5 years ago

Another example. Will Byrd was asking about a link between a gene and "glutamate". There's a bunch of things that might get lumped under the common term "glutamate". There's glutamic acid, then there's the ionized version, which is glutamate(1-), and there's also glutamate(2-), and there are also L- and D- forms (stereochemistry). All with different identifiers. There are ChEBI relationships between a lot of these, so we don't have to rely solely on going through English.
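A rough sketch of how those relationships could be used to collect the "glutamate" family without string matching. The triples below are illustrative and use plain names instead of real ChEBI identifiers; `RELATIONS`, `ALLOWED`, and `related_forms` are hypothetical names, and the choice of which relation types to follow is an assumption.

```python
from collections import deque

# Illustrative triples (subject, relation, object); names stand in for real IDs.
RELATIONS = [
    ("glutamic acid", "is_conjugate_acid_of", "glutamate(1-)"),
    ("glutamate(1-)", "is_conjugate_acid_of", "glutamate(2-)"),
    ("L-glutamic acid", "is_a", "glutamic acid"),
    ("D-glutamic acid", "is_a", "glutamic acid"),
    ("L-glutamic acid", "is_enantiomer_of", "D-glutamic acid"),
    ("glutamic acid", "has_role", "neurotransmitter"),
]

# Assumption: only these relation types connect "forms of the same chemical".
ALLOWED = {"is_conjugate_acid_of", "is_enantiomer_of", "is_a"}


def related_forms(start, max_hops=2, allowed=ALLOWED):
    """Collect nodes reachable from start over allowed relations.

    Edges are followed in both directions, up to max_hops steps.
    """
    neighbors = {}
    for subj, rel, obj in RELATIONS:
        if rel in allowed:
            neighbors.setdefault(subj, set()).add(obj)
            neighbors.setdefault(obj, set()).add(subj)

    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # hop budget used up on this branch
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return set(seen) - {start}


if __name__ == "__main__":
    print(sorted(related_forms("glutamate(1-)")))
```

Following `is_a` in both directions can drag in unrelated siblings on a real ontology, so the hop limit and the allowed set would need tuning.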

cbizon commented 5 years ago

LD is another excellent example of this.

cbizon commented 5 years ago

Another example from Mark:

So, the issue that people were running into is this. If you search for

MATCH (x)--(c:chemical_substance) WHERE x.id = 'HP:0002017' RETURN x.name, c.id LIMIT 100

and

MATCH (x)--(c:chemical_substance) WHERE x.id = 'HP:0002587' RETURN x.name, c.id LIMIT 100

which are "Nausea and vomiting" and "Projectile vomiting" respectively, you get two sets of results that don't overlap.

It was my assumption that robokop was only returning results that had those specific annotations, which makes sense, but the question is this: is there any way to get results for all nausea/vomiting related phenotypes? If not, how difficult would it be to change the underlying graph to leverage the hierarchy of HPO or something similar?

It looks like there are a lot of is_a and part_of type relations between anatomical entities, but that doesn't seem to hold true for disease_or_phenotypic_feature.
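For illustration, a small sketch of what "leveraging the hierarchy" could mean at query time: look up the term plus all of its descendants and union the results. The subclass edge is an assumption (the real HPO path may go through intermediate terms), the `EX:chem_*` annotations are made-up placeholders, and `chemicals_for` is a hypothetical helper, not an existing ROBOKOP function.

```python
# Assumed subclass edge (child -> parent); check against the actual HPO release.
SUBCLASS_OF = {
    "HP:0002587": "HP:0002017",  # Projectile vomiting -> Nausea and vomiting
}

# Placeholder direct annotations: phenotype -> linked chemical_substance ids.
ANNOTATIONS = {
    "HP:0002017": {"EX:chem_a", "EX:chem_b"},
    "HP:0002587": {"EX:chem_c"},
}


def descendants(term):
    """All terms that reach `term` by following subclass edges upward."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parent in SUBCLASS_OF.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def chemicals_for(term, include_descendants=True):
    """Chemicals annotated to the term, optionally including its subclasses."""
    terms = {term}
    if include_descendants:
        terms |= descendants(term)
    result = set()
    for t in terms:
        result |= ANNOTATIONS.get(t, set())
    return result


if __name__ == "__main__":
    print(sorted(chemicals_for("HP:0002017", include_descendants=False)))
    print(sorted(chemicals_for("HP:0002017")))
```

With the exact-match lookup the two phenotypes return disjoint sets; with the expansion, the parent term also picks up what is annotated to the child. The same effect could instead come from adding the hierarchy edges to the graph itself.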

cbizon commented 5 years ago

Essentially the same as #241

patrickkwang commented 5 years ago

This may be solved by a variant of query 'fuzzing', which we've talked about implementing as a messenger service.