NCATSTranslator / ReasonerAPI

NCATS Biomedical Translator Reasoners Standard API

How to encode the knowledge map? #27

Closed edeutsch closed 5 years ago

edeutsch commented 5 years ago

The proposed knowledge_map scheme was defined like this:

    KMap:
      description: "Map from question node and edge IDs to knowledge-graph entity identifiers and relationship references"
      type: object
      additionalProperties:
        type: string

And an example was provided like this:

    [
      {
        "n00": "MONDO:0005737",
        "n01": "HGNC:17770",
        "e00": "316"
      },
      {
        "n00": "MONDO:0005737",
        "n01": "HGNC:13236",
        "e00": "320"
      }
    ]

(hopefully the GitHub issue tracker doesn't make a mess of this JSON)

This strikes me as difficult to document and validate. As a counter-proposal, I modeled it like this instead:

    NodeEdgeNodeTriple:
      type: "object"
      description: "A node-edge-node triple binding a result to both the QueryGraph and KnowledgeGraph"
      properties:
        nodeA_QG_node_id:
          type: "string"
          example: "n00"
          description: "QueryGraph internal identifier for a QNode"
        nodeA_KG_id:
          type: "string"
          example: "OMIM:603903"
          description: "CURIE identifier for this node in the KnowledgeGraph"
        nodeB_QG_node_id:
          type: "string"
          example: "n01"
          description: "QueryGraph internal identifier for a QNode"
        nodeB_KG_id:
          type: "string"
          example: "UNIPROT:P01234"
          description: "CURIE identifier for this node in the KnowledgeGraph"
        edge_QG_edge_id:
          type: "string"
          example: "e00"
          description: "QueryGraph internal identifier for a QEdge"
        edge_KG_id:
          type: "string"
          example: "553903"
          description: "Local identifier for this edge which is unique within this KnowledgeGraph, and perhaps within the source reasoner's knowledge graph"
      additionalProperties: true

This is more explicit and easily documented and validated. But it is a lot more verbose.

What do you think? Go for compact or go for explicit?
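For comparison, the compact form can still be checked mechanically, though the check can say little more than "every value is a string". The following is a minimal sketch of such a check in Python, assuming each knowledge_map entry arrives as a parsed JSON object; the function name `validate_kmap` is hypothetical and not part of the proposed API.

```python
def validate_kmap(kmap):
    """Return a list of problems found in one compact knowledge_map entry.

    Mirrors the KMap schema above: an object whose additional
    properties must all be strings. An empty list means the entry passes.
    """
    if not isinstance(kmap, dict):
        return ["knowledge_map entry must be an object"]
    errors = []
    for key, value in kmap.items():
        if not isinstance(key, str):
            errors.append(f"{key!r}: key must be a string")
        if not isinstance(value, str):
            errors.append(
                f"{key!r}: expected a string, got {type(value).__name__}"
            )
    return errors


# The first entry of the example above passes.
good = {"n00": "MONDO:0005737", "n01": "HGNC:17770", "e00": "316"}
# An edge reference given as a number rather than a string does not.
bad = {"n00": "MONDO:0005737", "e00": 316}

good_errors = validate_kmap(good)
bad_errors = validate_kmap(bad)
```

Note what this cannot check: nothing in the schema says which keys are nodes and which are edges, or that the keys exist in the query graph. The explicit triple form documents those roles in the schema itself.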

patrickkwang commented 5 years ago

My understanding from reading through the yaml is that each Result would have a list of these NodeEdgeNodeTriples, but it's not clear to me how these can be used to describe a general answer. We should be able to use the information provided by this question/KG/binding framework to reproduce e.g. the Result's result_graph.

The compact example above describes two independent answers. These answers happen to contain exactly two nodes and one edge, but will in general have an arbitrary number of nodes and edges corresponding to the elements in the question graph.
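As a rough illustration of reproducing a result graph from the question, the knowledge graph, and one binding, here is a minimal Python sketch. The field names (`node_id`, `edge_id`, `id`, `source_id`, `target_id`) and the endpoints assigned to edges "316" and "320" are assumptions made for the example, not something fixed by this thread.

```python
def build_result_graph(query_graph, knowledge_graph, kmap):
    """Look up the knowledge-graph nodes and edges bound by one answer."""
    kg_nodes = {node["id"]: node for node in knowledge_graph["nodes"]}
    kg_edges = {edge["id"]: edge for edge in knowledge_graph["edges"]}
    # Every element of the question graph is bound through the map.
    nodes = [kg_nodes[kmap[q["node_id"]]] for q in query_graph["nodes"]]
    edges = [kg_edges[kmap[q["edge_id"]]] for q in query_graph["edges"]]
    return {"nodes": nodes, "edges": edges}


# Hypothetical one-hop question graph: (n00)-[e00]-(n01).
query_graph = {
    "nodes": [{"node_id": "n00"}, {"node_id": "n01"}],
    "edges": [{"edge_id": "e00", "source_id": "n00", "target_id": "n01"}],
}

# Hypothetical knowledge graph holding the entities from the example.
knowledge_graph = {
    "nodes": [
        {"id": "MONDO:0005737"},
        {"id": "HGNC:17770"},
        {"id": "HGNC:13236"},
    ],
    "edges": [
        {"id": "316", "source_id": "MONDO:0005737", "target_id": "HGNC:17770"},
        {"id": "320", "source_id": "MONDO:0005737", "target_id": "HGNC:13236"},
    ],
}

# The two independent answers from the compact example.
knowledge_maps = [
    {"n00": "MONDO:0005737", "n01": "HGNC:17770", "e00": "316"},
    {"n00": "MONDO:0005737", "n01": "HGNC:13236", "e00": "320"},
]

result_graphs = [
    build_result_graph(query_graph, knowledge_graph, kmap)
    for kmap in knowledge_maps
]
```

Each answer yields its own small graph, and the same function works unchanged for a question graph with any number of nodes and edges.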

edeutsch commented 5 years ago

Ah, I see, I misunderstood your example. So, in your scheme, would a single 2-hop answer look like this?

    [
      {
        "n00": "MONDO:0005737",
        "n01": "HGNC:17770",
        "e00": "316",
        "n02": "CHEMBL.COMPOUND:CHEMBL112",
        "e01": "48593"
      }
    ]

In my version this would be two triples with n01 repeated. The same information is there, but expressed a lot more pedantically.

Now that I understand your example, I suppose this knowledge_map is really just a lookup table. The basis is the query_graph, and all you use the knowledge_map for is to map the query_graph onto this answer instance by binding the query_graph entities to specific instances in the knowledge_graph.

Okay, with that new understanding that knowledge_map is just a lookup table, then the way you had it was better. I get it now. I think?
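To make the equivalence concrete, the sketch below expands one compact lookup table into the triple form by walking the query graph's edges. It is only an illustration: the `source_id`/`target_id` fields on query edges, and the direction chosen for each edge, are assumptions.

```python
def kmap_to_triples(query_edges, kmap):
    """Expand a compact lookup table into one triple per query-graph edge."""
    triples = []
    for q_edge in query_edges:
        node_a = q_edge["source_id"]
        node_b = q_edge["target_id"]
        edge = q_edge["edge_id"]
        triples.append({
            "nodeA_QG_node_id": node_a,
            "nodeA_KG_id": kmap[node_a],
            "nodeB_QG_node_id": node_b,
            "nodeB_KG_id": kmap[node_b],
            "edge_QG_edge_id": edge,
            "edge_KG_id": kmap[edge],
        })
    return triples


# Hypothetical 2-hop question graph: (n00)-[e00]-(n01)-[e01]-(n02).
query_edges = [
    {"edge_id": "e00", "source_id": "n00", "target_id": "n01"},
    {"edge_id": "e01", "source_id": "n01", "target_id": "n02"},
]

# The single 2-hop answer in compact form.
kmap = {
    "n00": "MONDO:0005737",
    "n01": "HGNC:17770",
    "e00": "316",
    "n02": "CHEMBL.COMPOUND:CHEMBL112",
    "e01": "48593",
}

triples = kmap_to_triples(query_edges, kmap)
```

The structure lives in the query graph, so the compact form only needs to carry the bindings. The triple form repeats the shared node n01 in both triples.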

patrickkwang commented 5 years ago

Yes, this all sounds correct.

edeutsch commented 5 years ago

So, a slightly off-topic question about the knowledge map scheme. Say my question is "Which drugs inhibit COX1?"

    Answer #1: (aspirin) -- inhibits --> (PTGS1)
    Answer #2: (loxoprofen) -- is_a --> (NSAID) -- inhibits --> (PTGS1)

How do we do that in the knowledge_map scheme?

patrickkwang commented 5 years ago

It will be the reasoner module's job to produce their answers in the format requested by the message. If the question asks for (n0)-[inhibits]->(PTGS1), then the reasoner will need to internally infer that (loxoprofen)-[is_a]->(NSAID)-[inhibits]->(PTGS1) implies (loxoprofen)-[inhibits]->(PTGS1).
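One way a reasoner might perform that inference internally is sketched below in Python. This is purely illustrative: the edge tuples, the function names, and the rule "anything that is_a (transitively) an inhibitor is itself an inhibitor" are assumptions for the example, not a description of any existing reasoner.

```python
def ancestors(node, parents):
    """Collect everything reachable from node by following is_a edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_inhibitors(edges, target):
    """Return entities that inhibit target directly or via an is_a chain."""
    direct = {s for s, predicate, o in edges
              if predicate == "inhibits" and o == target}
    parents = {}
    for s, predicate, o in edges:
        if predicate == "is_a":
            parents.setdefault(s, set()).add(o)
    inferred = set(direct)
    for node in parents:
        # A node inherits "inhibits" if any of its ancestors has it.
        if ancestors(node, parents) & direct:
            inferred.add(node)
    return inferred


# Hypothetical knowledge-graph edges as (subject, predicate, object).
edges = [
    ("aspirin", "inhibits", "PTGS1"),
    ("loxoprofen", "is_a", "NSAID"),
    ("NSAID", "inhibits", "PTGS1"),
]

# Each inferred inhibitor becomes one binding for the question's node n0.
answers = [{"n0": drug} for drug in sorted(infer_inhibitors(edges, "PTGS1"))]
```

With this rule the class node NSAID is also returned as an answer, since the example edges state that it inhibits PTGS1 directly. Whether to report such class-level nodes is a choice left to the reasoner.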

edeutsch commented 5 years ago

Okay, fair enough. I will just revert the schema to what Patrick originally had then:

    KMap:
      description: "Map from question node and edge IDs to knowledge-graph entity identifiers and relationship references"
      type: object
      additionalProperties:
        type: string

A simple lookup table Dict. Unless anyone else has comments on this aspect?

edeutsch commented 5 years ago

YAML and pptx updated with latest consensus. Further comments?

edeutsch commented 5 years ago

Since no further comments, closing this issue.