Closed karafecho closed 2 years ago
One thing I don't think we resolved is what responsibilities the KPs own versus the ARAs. For instance, if ARAXi approaches the ICEES+ team and requests something completely different (or in a different form) than what ARAGORN is interested in, then how shall we respond?
I'm not sure we'll be able to address this issue in GitHub, but I did want to mention it since we didn't have time to discuss the issue yesterday.
I agree with all of this except 5. IMO we should think about the operations that ICEES supports, and I thought that we agreed to 2:
1) Given a KG in a ReasonerAPI message, ICEES adds edges between all nodes in the KG 2) Given a one hop machine query, ICEES returns answers
Those were our two issues #20 and #21 . Does those being closed mean that these things are implemented now?
@bizon is the difference between @karafecho's 5 and your 2 one hop vs applying to the whole graph? Or maybe I'm comparing the wrong things.
5 is a little confusing because it's conflating 2 things: What operation ICEES will provide and what ARAGORN is going to do with it. I think we should write the statements about what ICEES will provide (letting those be informed by what ARAs like ARAGORN will find useful).
I think that what 5 is implying is an operation like my (1). you have a reasonerstd message where there is a knowledge_graph element, and it has some nodes. Then this operation will, for every pair of nodes in that list of nodes, generate 1 or more edges according to @karafecho 's 1-4.
But I also think that there should be an ICEES endpoint that does the "fill in the blank" operation, as long as there is only one blank. That is, an operation that looks into the query_graph section of a reasonerStd message, and provides answers to questions of the form (a)--(b), where (a) is an entity specified with a curie, and b is a type.
Does that make more sense?
It makes sense. I've implemented the overlay api. which is 1 by @cbizon. The 2 is essentially the associations_to_all_features api endpoint, but with some tweaks, which I'll work on after I've deployed 1. Do we need to provide a reasonerStd message based api for that, too?
@cbizon @xu-hao Ok, that all helps. I'd suggest that it is a reasoner standard API since, architecturally, that's what the FOA calls for ARAs to interact with and it will make it a natural endpoint for other ARAs to call.
Any thoughts on a name for @cbizon's 2?
Is overlay the term from araxi for that operation? I think overlay or decorate or something like that...
I think overlay makes more sense for 1, eh? i.e. one has a giant graph and wants to decorate it. But 2 is a question graph w/two nodes and we want all associations between them right?
Sorry, I missed that you were asking for a name for 2. 2 is a question graph like (diabetes)--(drug). i.e. one of the nodes is bound. I guess part of the naming is do you anticipate ever supporting more than 1-hops? LIke a generic graph query interface, and if so, would it be this function or a different one. If yes, then I might call this "graph_query" or something like that, but if no, then I like get_associations (but it is kinda generic)
Right, get_associations is boring but it's my current fav. Not that naming is the most vital piece of this. But numbers get a little hard to follow.
I am admittedly a bit confused, and I'm thinking that a graphic might help clarify things, not just for me, but for others inside/outside of the core team, but I would like to make a few points, fwiw.
Just my two cents...
Yep, I agree with pretty much all of what you say @karafecho . My comment that you mention above is just trying to decouple the operations from how the results of those operations would be used. We tend to think that overlay->support and onehop->pathfinding, but there's no reason those couldn't be inverted.
I'm happy to help with a diagram, but my visual skills are pretty low. I'm imagining that this would be a graphic to explain what each endpoint does?
WRT the graphic, I think we need a conceptual visual that demonstrates overlay -> support, one-hop-> pathfinding and the inverse. That way, we can figure out where ICEES+ endpoints fit in, as well as other KP endpoints, e.g., COHD, and perhaps use the graphic to also illustrate how different ARAs plan to integrate the KPs. Does that make sense and seem reasonable? I don't think we need a pretty graphic, just something.
@cbizon @stevencox @xu-hao : Please comment.
Here's my attempt at what I think would go into a graphic
The first one illustrates the 1 hop function, and the second is the overlay
Left side is input, right side is output, Q, KG, A are the 3 parts of a ReasonerAPI message
There's actually an interesting question on whether the overlay should pay attention to any input answers as well, and only look for pairs that occur in the same answer, but I have not illustrated that here.
This is very helpful. Thanks, @cbizon!
@karafecho @cbizon Thanks. That really clarifies things. With regard to 2, should we also add loops, i.e., edges that connect a node to itself or it is not necessary? The current implementation that I gave Patrick has loops.
Reasoner Standard API specs:
Input (question/query) formats
Output (results) formats
Outdated? (See below)
- Natural language question (NLQ) e.g. “What genes are associated with X?” ambiguous difficult/impossible to parse, in general
- “Query type ID” (QID) e.g. Q0 (“What is X?”), Q22 (WF1MOD2), … closed set - cannot ask new types of questions
These are no longer supported input formats. The query graph is the only standard input format now.
- “dense” format result includes complete nodes and edges sometimes needlessly verbose difficult to align/compare/chain
This is no longer a supported output format. The "bindings" format is the only standard output format now.
Thanks, @patrickkwang. I pulled the input/output info from what appears to be an outdated slide deck. The goal, however, was to more formally define the Q, KG, and A on the left side of Chris' figure, as well as the output on the right.
As you can see, my familiarity with the Reasoner Standard API is only at a high level. My hope is to fill in some of the details, so I can better understand the challenges. Does that make sense?
Reasoner Standard API specs (updated):
Input:
Question/query graph (QG) limited expressiveness“message”/“bindings” format
Output:
“message”/“bindings” format result includes bindings between shared knowledge graph and question/query graph concise compatible with query/question graph (QG) input - can chain modules
I contemplated drawing the figure with bindings, but decided that it overcomplicated the image.
@xu-hao I guess the loops come from when there are more than one features associated with an input concept? I don't think they're necessary, but I also don't think that they hurt anything.
Hao and I (mostly Hao) pulled together the 'inventory' specs found in this Data Modeling doc, which was discussed during today's meeting. Comments are welcome.
I think I'm feeling more comfortable with all of this now and better understanding the details. Certain (minor?) things are still a bit fuzzy, but that's probably my problem.
FYI: Hao, Steve, and I met with David K. and Megha S. (ARAXi Expander Agent). They are going to take a shot at the overlay endpoint and then try the one-hop endpoint after Hao finishes developing it. The queries that I sent previously can be adapted to support the new endpoints. Happy to do that, if that would be helpful.
@cbizon loop can also be between the same features associated with an input concept.
@xu-hao I'm not sure I understand, can you give an example?
@cbizon For example a curie X can map to a feature Y, we can generate a contingency matrix between Y and itself. which will have p-value of 0. This will become and edge between X and itself, which is a loop on X.
I think at the feature level, calculating association of a feature with itself is a bug?
I have it there for completeness. But I can filter it out.
Thanks for meeting yesterday, @xu-hao @cbizon @stevencox @patrickkwang. I've attempted to summarize the agreed-upon plan and next steps below. Please feel free to edit.