NCATSTranslator / ReasonerAPI

NCATS Biomedical Translator Reasoners Standard API

QEdge "length" specification #154

Open patrickkwang opened 4 years ago

patrickkwang commented 4 years ago

We'd like something like Cypher's variable-length paths. On a query graph edge, we could encode this with a property like "length": "*3..5".
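To make the proposal concrete, here is a minimal sketch of how a server might parse such a value into hop bounds. The `length` property is only the proposal from this comment and is not part of TRAPI; `parse_length` and the edge keys are hypothetical names. The defaults for omitted bounds are assumed to follow Cypher's conventions (`*` means one or more, `*4` means exactly four).

```python
import re
from typing import Optional, Tuple

# "*", "*4", "*3..5", "*3..", "*..5" (Cypher-style; assumed syntax for the proposal)
_LENGTH_RE = re.compile(r"^\*(\d*)(\.\.(\d*))?$")


def parse_length(spec: str) -> Tuple[int, Optional[int]]:
    """Return (min_hops, max_hops); max_hops is None when unbounded."""
    m = _LENGTH_RE.match(spec.strip())
    if m is None:
        raise ValueError(f"not a valid length spec: {spec!r}")
    low, span, high = m.group(1), m.group(2), m.group(3)
    min_hops = int(low) if low else 1  # omitted lower bound defaults to 1
    if span is None:
        # No "..": a bare number is an exact length, a bare "*" is unbounded.
        max_hops = min_hops if low else None
    else:
        max_hops = int(high) if high else None
    if max_hops is not None and max_hops < min_hops:
        raise ValueError(f"max is smaller than min in {spec!r}")
    return min_hops, max_hops


# Hypothetical query graph edge carrying the proposed property.
qedge = {"subject": "n0", "object": "n1", "length": "*3..5"}
bounds = parse_length(qedge["length"])
```

Parsing alone does not answer the binding question raised next: it only tells the server how many hops are allowed, not where intermediate nodes go in a result.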

However, it is not clear to me how things should be bound to this query graph. In particular, there is no obvious place to bind nodes that occur in the middle of the path, and no way to enforce constraints on those nodes.

One way to constrain this that may make it more interpretable is to allow lengths only on self-edges. So we could encode things like this: (disease)-[subClassOf]->(disease)-[subClassOf]->(disease)- ... which is actually really useful...
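One reason the self-edge restriction is easier to interpret: a self-edge of a given length can be unrolled into an explicit chain in which every intermediate node simply inherits the constraints of the original node, so each node in the path has a place to be bound. The sketch below assumes that reading; `expand_self_edge`, the key names, and the Biolink-style identifiers are illustrative and not taken from the spec.

```python
from typing import Any, Dict


def expand_self_edge(
    node_id: str, node: Dict[str, Any], predicate: str, hops: int
) -> Dict[str, Dict[str, Any]]:
    """Unroll a self-edge of exactly `hops` steps into an explicit query graph.

    Every generated node is a copy of the original node's constraints.
    """
    if hops < 0:
        raise ValueError("hops must be non-negative")
    nodes = {f"{node_id}_{i}": dict(node) for i in range(hops + 1)}
    edges = {
        f"e{i}": {
            "subject": f"{node_id}_{i}",
            "object": f"{node_id}_{i + 1}",
            "predicate": predicate,
        }
        for i in range(hops)
    }
    return {"nodes": nodes, "edges": edges}


# (disease)-[subClassOf]->(disease)-[subClassOf]->(disease)
chain = expand_self_edge(
    "d", {"category": "biolink:Disease"}, "biolink:subclass_of", hops=2
)
```

A range such as 3 to 5 would then correspond to one unrolled graph per allowed length, which is only feasible when the upper bound is finite.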

@cbizon @balhoff

cbizon commented 4 years ago

One feature of this in Cypher that is exceptionally useful is that the minimum length can be zero. So you can say

(a:gene)--(b:biological_process)-[:subClassOf*0..]-(c:biological_process {name:"immune system process"})

and it will return any gene annotated either to a subclass of "immune system process" (isp) or to isp itself.
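The effect of a zero minimum can be seen on a toy hierarchy. The sketch below collects every term reachable within a hop range, level by level; with `min_hops=0` the starting term is part of the answer. The function name and the three-term hierarchy are illustrative only, and the traversal assumes the hierarchy is acyclic (an unbounded maximum would not terminate on a cycle).

```python
from typing import Dict, Iterable, Optional, Set


def within_hops(
    start: str,
    children: Dict[str, Iterable[str]],
    min_hops: int = 0,
    max_hops: Optional[int] = None,
) -> Set[str]:
    """Terms reachable from `start` using between min_hops and max_hops edges.

    `frontier` holds the terms reachable in exactly `depth` steps, so a term
    reachable at several depths is still counted correctly for any range.
    """
    found: Set[str] = set()
    frontier: Set[str] = {start}
    depth = 0
    while frontier and (max_hops is None or depth <= max_hops):
        if depth >= min_hops:
            found |= frontier
        frontier = {c for n in frontier for c in children.get(n, ())}
        depth += 1
    return found


# Toy parent -> direct subclasses map (illustrative data).
subclasses = {
    "immune system process": ["immune response"],
    "immune response": ["innate immune response"],
}

with_self = within_hops("immune system process", subclasses, min_hops=0)
without_self = within_hops("immune system process", subclasses, min_hops=1)
```

With a minimum of one, genes annotated directly to the top-level term would need a second query; with a minimum of zero, one pattern covers both cases.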

balhoff commented 4 years ago

SPARQL has property paths, but you can't set a specific length. The only quantifiers are zero or one (?), zero or more (*), and one or more (+).
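A possible workaround, not proposed in this thread, is to compose the three quantifiers with the sequence operator `/`: repeat the predicate for the required hops, then append optional steps or a trailing `+`. The helper below is a hypothetical sketch of that translation and only builds the path string; it does not run a query.

```python
from typing import Optional


def sparql_path(pred: str, min_hops: int, max_hops: Optional[int] = None) -> str:
    """Build a SPARQL 1.1 property path matching min_hops..max_hops steps of pred."""
    if min_hops < 0 or (max_hops is not None and max_hops < min_hops):
        raise ValueError("invalid hop range")
    if max_hops == 0:
        # A path of exactly zero steps has no form in this simple scheme.
        raise ValueError("a zero-length-only path is not supported here")
    if max_hops is None:
        if min_hops == 0:
            return pred + "*"
        # n or more: n-1 mandatory steps followed by one-or-more.
        parts = [pred] * (min_hops - 1) + [pred + "+"]
    else:
        # n..m: n mandatory steps followed by m-n optional steps.
        parts = [pred] * min_hops + [pred + "?"] * (max_hops - min_hops)
    return "/".join(parts)


path = sparql_path("rdfs:subClassOf", 3, 5)
```

This gets verbose for large bounds, and like any property path it gives no variable for the intermediate nodes, which is the same binding gap noted for the query graph.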

southalln commented 4 years ago

Can I ask a naive question here? Is TRAPI supposed to be a literal instantiation of a question (I'm looking for something of length 3-5) or a shell that a resource fills in (User: I want drugs to treat diabetes; Response: I can give you drugs that downregulate targets that are overactivated in diabetes and that are appropriate for diabetic patient demographics)? I am assuming that most resources will expand or modify the query graph to explain how their result is relevant or important. In that case, every edge in the query graph could be replaced by a much more complex structure, and a label like length: 3-5 would not be a useful ornament.

amykglen commented 4 years ago

I've been pondering a similar distinction lately: the query graph representing the question vs. the structure of the results. It does seem like we're trying to use it for both... but should they really be represented in the same structure? Do we even need to capture the latter anywhere, or could it be deduced by inspecting a result?

patrickkwang commented 4 years ago

There was some discussion around this question at the ARS session during the Sept 2020 relay meeting. I think query graphs are currently used in both ways: as descriptions of a user's question that can be interpreted in clever ways, and as detailed specifications for exactly what form of result is required. The latter is maybe less useful for ARS->ARA communication, but there are varying opinions on how prescriptive ARA->KP queries ought to be. It might be nice for clients to be able to ask a "strict" question without worrying about server improvisation.

edeutsch commented 4 years ago

> Is TRAPI supposed to be a literal instantiation of a question

I would say yes. But we're not yet at the place where we can capture all the richness that an English question can convey in a TRAPI QueryGraph.

> User: I want drugs to treat diabetes

But English questions can be vague or open to interpretation as well. In fact, that isn't even a question. Maybe the correct response is "I am unable to write prescriptions for drugs. But you can ask me questions like:

I wonder if it would be worth our while to take some of our vague starter questions and think about all the various specific semantic variations of that question that we can think of, and once we have that list, dream up ways to convey those specific questions as QueryGraphs (and extensions).

Maybe we are doing ourselves a disservice by encouraging ARAs to overinterpret QueryGraphs. Perhaps QueryGraphs should actually be very specific and narrowly interpreted by ARAs, because if each ARA goes off with its own complex interpretation of a simple QueryGraph, the job of the ARS to aggregate those results may be very difficult.

Perhaps we should focus on ways to ask very specific questions and try to have a primary user interface that prompts the user to ask very specific questions. Or do we prefer the paradigm where the user asks a vague question, gets 6 different answers to 6 different specific questions which are all related to but different from the original question (based on the creativity of each of the ARA implementers)? Might the user be intrigued to see 6 different interpretations of their question and eagerly explore all the different ramifications of their vague question? Or will the user simply select the one question/answer that best fits the question in their mind when they asked? Or will the user realize that their question was poorly described and start over with a more specific question, having wasted time and effort by being allowed to ask a vague question?

cbizon commented 4 years ago

> Is TRAPI supposed to be a literal instantiation of a question (I'm looking for something of length 3-5) or a shell that a resource fills in (User: I want drugs to treat diabetes; Response: I can give you drugs that downregulate targets that are overactivated in diabetes and that are appropriate for diabetic patient demographics).

IMO, yes :)

In other words, TRAPI defines the syntax of a message. But the semantics of the message are determined by what the service being called chooses to do. And that service should be very clear about its intentions via metadata.

So I could imagine an endpoint that says "I am going to take the question graph as a literal pattern to execute" and I could imagine a different endpoint that says "I am going to take the question graph as a high level user intent, and I am going to figure out how to modify or create a question that is consistent with your question".

This becomes either 2 different operations, or one with parameters, and whoever calls that component can decide which to call based on what they want.
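The "one operation with parameters" option could look roughly like the sketch below, where the caller states which interpretation it wants. Everything here is hypothetical: TRAPI defines no such parameter, and `Interpretation`, `plan_query`, and the `rewrite` callback are names invented for illustration.

```python
import copy
from enum import Enum
from typing import Any, Callable, Dict, Optional

QueryGraph = Dict[str, Any]


class Interpretation(Enum):
    LITERAL = "literal"  # execute the query graph exactly as given
    INTENT = "intent"    # treat it as user intent; the service may rewrite it


def plan_query(
    qgraph: QueryGraph,
    mode: Interpretation,
    rewrite: Optional[Callable[[QueryGraph], QueryGraph]] = None,
) -> QueryGraph:
    """Return the query graph the service will actually execute."""
    working = copy.deepcopy(qgraph)  # never mutate the caller's graph
    if mode is Interpretation.LITERAL:
        return working
    if rewrite is None:
        raise ValueError("intent mode needs a rewrite strategy")
    return rewrite(working)


def add_intermediate_gene(q: QueryGraph) -> QueryGraph:
    # Illustrative rewrite: route the question through a gene node.
    q["nodes"]["n_gene"] = {"category": "biolink:Gene"}
    return q


question = {"nodes": {"n0": {}, "n1": {}}, "edges": {"e0": {}}}
literal_plan = plan_query(question, Interpretation.LITERAL)
intent_plan = plan_query(question, Interpretation.INTENT, add_intermediate_gene)
```

Returning the executed graph alongside the results would also let the caller see how far the service departed from what was asked.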

At that point, we still have the questions of: what do we demand of ARAs (implement one or the other or both?). What is the ARS going to call (maybe either at different times based on its own workflow or user specs).

colleenXu commented 3 years ago

Putting on my "user" hat:

I would prefer to see in the UI that I need a specific/answerable question in a format/template, which will be answered literally. I can learn to deal with that template and ask questions in a way that gets at what I'm looking for (example queries/answers that can then be adjusted to ask different questions, and tutorials, are super helpful here).

I personally can't get used to a robot with opaque inner workings (think about the "assistants" like Alexa/Cortana/etc. where they say "ask anything" and don't give templates or what specializations these assistants have).

Recall this kind of situation: I ask a vague question like "What's on the dark side of the moon?", and the robot asks "do you mean Pink Floyd's 'Dark side of the moon'? Pink Floyd........." [it took a path that's specific and actually in the robot's scope to reach this kind of answer]. I get frustrated and confused to get an answer that wasn't at all what I wanted (especially if this took time to receive).

Furthermore, I think it would be overwhelming to get multiple different types of answers since the robot took multiple paths without asking me first if that was even relevant to the actual nebulous question in my head.