NCATS-Gamma / robokop

Master UI for ROBOKOP
MIT License

Ranking leans towards short circuits? #466

Open cbizon opened 4 years ago

cbizon commented 4 years ago

https://robokop.renci.org/a/72fc5b84-58f6-4f46-83ca-c5ee28d7ae31_f246c8bb-a271-4b0d-bd50-567ac8a073f7/

Question is going from glutamate receptors to memory loss using a COP-style query:

[image]

First answer: [image] The memory impairment / nervous system / neuron part is all fine, but it only connects to the gene part of the graph via neuron (not a very specific node).

Fourth answer: [image]

The left side of the graph is the same, and here the process is much more tightly bound to that side, so that should increase the score. But the gene is less strongly connected. In the first answer, the gene was more tightly coupled to (only) the cell. Here, it's still coupled to the cell, but less strongly. So even though it's now also coupled to the process and the anatomy, those extra connections get outweighed by the first answer's stronger connection to the cell.

I expect that this could be "solved" by modifying how edge scores are calculated?

patrickkwang commented 4 years ago

I imagine we could get the answer you want here by modifying how quickly the edge weights saturate. My concern is that there are other cases where we would prefer the parameters we have now. Before changing anything, I'm going to browse through all the issues and emails I can find and compile a list of queries that we want to go a certain way. There will be some bias toward things we don't/didn't like, but at least that will give us a few regression tests for parameter modifications.
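To make the saturation point concrete, here is a minimal sketch, not the actual ROBOKOP ranker. It assumes a hypothetical logistic-style function `saturating_weight` that maps an evidence count to an edge weight, and it uses a plain sum of weights as a stand-in for the real score. The counts (one edge with 50 supporting items versus three edges with 3 each) and the `steepness` values are made up for illustration.

```python
import math


def saturating_weight(count: float, steepness: float) -> float:
    """Map a non-negative evidence count to a weight in [0, 1).

    Hypothetical form: a logistic curve rescaled so that count=0 gives 0.
    Larger `steepness` means the weight saturates after fewer counts.
    """
    return 2.0 / (1.0 + math.exp(-steepness * count)) - 1.0


def total_support(counts, steepness: float) -> float:
    """Stand-in score: sum of edge weights (an assumption, not the real ranker)."""
    return sum(saturating_weight(c, steepness) for c in counts)


# Illustrative evidence counts only.
one_strong_edge = [50]        # gene tightly coupled to a single node
three_weak_edges = [3, 3, 3]  # gene weakly coupled to three nodes

for steepness in (0.05, 1.0):
    strong = total_support(one_strong_edge, steepness)
    weak = total_support(three_weak_edges, steepness)
    winner = "one strong edge" if strong > weak else "three weak edges"
    print(f"steepness={steepness}: strong={strong:.3f} weak={weak:.3f} -> {winner}")
```

With slow saturation the single strong edge dominates; with fast saturation each weak edge is already close to its ceiling, so the three weak edges win together. That is the trade-off at stake: the same parameter change that fixes this query could flip other rankings that currently look right.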

cbizon commented 4 years ago

I agree with your concern. Part of recording this one and a few others is to start to assemble a corpus that we can test against...
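A test corpus of this kind could be sketched roughly as follows. Everything here is hypothetical: the `RankingExpectation` record, the `failed_expectations` helper, the answer labels, and the stub rankers are illustrative names, not part of ROBOKOP. The example also assumes that the intent of this issue is for the fourth answer to outrank the first.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class RankingExpectation:
    """One recorded preference: for `query`, `better` should rank above `worse`."""
    query: str
    better: str
    worse: str


def failed_expectations(
    expectations: Sequence[RankingExpectation],
    rank_answers: Callable[[str], Sequence[str]],
) -> List[RankingExpectation]:
    """Return the expectations that a ranker (best answer first) violates."""
    failures = []
    for exp in expectations:
        ranked = list(rank_answers(exp.query))
        missing = exp.better not in ranked or exp.worse not in ranked
        if missing or ranked.index(exp.better) > ranked.index(exp.worse):
            failures.append(exp)
    return failures


# Hypothetical corpus entry for this issue; the labels are placeholders.
corpus = [
    RankingExpectation(
        query="glutamate receptors -> memory loss",
        better="fourth_answer",
        worse="first_answer",
    ),
]


def current_ranker(query: str) -> List[str]:
    """Stub for the ordering reported in this issue."""
    return ["first_answer", "second_answer", "third_answer", "fourth_answer"]


def candidate_ranker(query: str) -> List[str]:
    """Stub for an ordering after a hypothetical parameter change."""
    return ["fourth_answer", "first_answer", "second_answer", "third_answer"]


print(len(failed_expectations(corpus, current_ranker)))    # 1
print(len(failed_expectations(corpus, candidate_ranker)))  # 0
```

Recording pairwise preferences instead of full expected orderings keeps each entry small and lets a parameter change be judged by how many recorded preferences it breaks or fixes.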