NCATS-Tangerine / tranql

A Translator Query Language
https://researchsoftwareinstitute.github.io/data-translator/apps/tranql
MIT License

Connect kg fix #97

Closed YaphetKG closed 4 years ago

YaphetKG commented 4 years ago

Hi @stevencox,

I was testing out the previous version of connect_knowledge_map and found some issues such as:

  1. It treated the knowledge map as a list of edge and node bindings and tried to fill in the blanks from the knowledge graph. So something like a->b->c, c->d->e would become a linear path a->b->c->d->e, but testing and asserting for an answer set like [a->b] [b->c] [b->d] [d->e] [c->e] gets a bit complicated. I thought it might be better to construct a graph from the answer set and the question graph, and then find paths in that new knowledge graph.
    • In doing so, I found that each sub-question sent to the knowledge providers is generated with fresh ids, so for a query like chemical_substance->gene->disease the two generated questions would be chemical_substance -[e0]-> gene and gene -[e0]-> disease. The identical edge ids become a problem; generating the edge id from the edge's source_id and target_id seems to address this.
    • Another change in the new connector: if a knowledge provider errors somehow and returns no edge bindings for some node bindings, those nodes are ignored. This case was present in some of the tranql unit tests, and I've updated those test results to be in line with this.
    • I have tried to cover this with new tests that assert its behavior on different shapes of graphs (https://github.com/NCATS-Tangerine/tranql/compare/connect_kg_fix?expand=1#diff-f759e475413519d7f06face5c831e21aR1569).
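To illustrate the edge id collision and the fix described above, here is a minimal sketch. It is not the code from this branch: the helper name make_edge_id, the id format, and the simplified edge dicts are assumptions made only for illustration.

```python
def make_edge_id(edge_id, source_id, target_id):
    """Hypothetical helper: qualify a per-question edge id with its endpoints."""
    return f"{edge_id}_{source_id}_{target_id}"


# The two sub-questions generated from chemical_substance->gene->disease.
# Each one numbers its own edge starting at e0, so the raw ids collide.
sub_question_edges = [
    {"id": "e0", "source_id": "chemical_substance", "target_id": "gene"},
    {"id": "e0", "source_id": "gene", "target_id": "disease"},
]

raw_ids = [edge["id"] for edge in sub_question_edges]
qualified_ids = [
    make_edge_id(edge["id"], edge["source_id"], edge["target_id"])
    for edge in sub_question_edges
]
```

With the endpoints folded in, the two edges no longer share an id even though both sub-questions call their edge e0.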

In summary: the new connector builds a graph from the knowledge map and tries to find paths for each node that has not already been included in a previous path.
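The approach can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the actual connect_knowledge_map code: edges are plain (source, target) pairs, and the names build_graph, paths_from, and connect are hypothetical.

```python
def build_graph(edges):
    """Build an adjacency mapping from (source, target) pairs."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
    return graph


def paths_from(graph, node, path=None):
    """Return every simple path that starts at node and cannot be extended."""
    path = (path or []) + [node]
    # Skip targets already on the path so a cycle cannot loop forever.
    next_nodes = [n for n in graph.get(node, []) if n not in path]
    if not next_nodes:
        return [path]
    paths = []
    for n in next_nodes:
        paths.extend(paths_from(graph, n, path))
    return paths


def connect(nodes, edges):
    """Find paths for each node not already included in a previous path."""
    graph = build_graph(edges)
    # Nodes that appear in no edge (no edge bindings) are ignored.
    bound = {n for edge in edges for n in edge}
    seen = set()
    all_paths = []
    for node in nodes:
        if node in seen or node not in bound:
            continue
        for path in paths_from(graph, node):
            if len(path) > 1:  # a lone node is not a path
                all_paths.append(path)
                seen.update(path)
    return all_paths


# Answer set [a->b] [b->c] [b->d] [d->e] [c->e], plus a node x with no edges.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e"), ("c", "e")]
paths = connect(["a", "b", "c", "d", "e", "x"], edges)
```

For this answer set the sketch yields the two paths a->b->c->e and a->b->d->e, and x is dropped because it has no edges. The result depends on the order in which nodes are visited, which is a property of this sketch and may not match the real connector.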