Example of Scoring issue: MVP2 Paroxetine MTOR is way too high

TranslatorIssueCreator commented 3 months ago

Type: Bug Report

URL: https://ui.test.transltr.io/main/results?l=Paroxetine&i=CHEBI:7936&t=3&r=0&q=45f60838-5834-4ccd-a580-e21611cf1124

ARS PK: 45f60838-5834-4ccd-a580-e21611cf1124

Steps to reproduce:

MVP2 with Paroxetine

Screenshots:

sandrine-muller-research commented 3 months ago

When I ran this query, I was unhappy to see that the MTOR node (data mining edges only) is ranked way too high, in my opinion. 1) It appears before some paths that contain look-ups from curated databases. 2) It contains only increase/decrease activity edges (affected by), while other results below it are mixtures with more specific predicates (see CYP2D6). 3) MTOR is a master regulator that could affect many genes (high degree, or high cardinality, etc.), which is certainly what is going on here; the scoring does not seem to be corrected for degree. Linked to issue #617, but for genes.

To summarize from a user perspective: 1) look-ups from text mining should be treated differently from look-ups from curated databases; 2) predicate specificity should be used to pull results up; 3) the score should be corrected for node degree in the overall graph (even if each ARA corrects for it at its own level, that does not guarantee the effect will be absent from the ARS total graph).
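
To make the three points above concrete, here is a rough sketch of what such an adjustment could look like. This is not how the ARS or any ARA scores results today: the function name, the source and predicate labels, and every weight are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical weights, for illustration only (not taken from any Translator component).
SOURCE_WEIGHT = {"curated": 1.0, "text_mining": 0.5}
PREDICATE_WEIGHT = {"related_to": 0.2, "affects": 0.6, "treats": 1.0}


def adjusted_score(raw_score, edges, node_degree):
    """Rescale a raw result score using the three ideas above.

    edges: list of (source_kind, predicate) pairs supporting the result.
    node_degree: degree of the result node in the overall merged graph.
    """
    if not edges:
        return 0.0
    # Points 1 and 2: weight each edge by its source kind and predicate specificity.
    edge_factor = sum(
        SOURCE_WEIGHT.get(kind, 0.5) * PREDICATE_WEIGHT.get(pred, 0.5)
        for kind, pred in edges
    ) / len(edges)
    # Point 3: damp high-degree hub nodes; the penalty is 1.0 at degree 0.
    degree_penalty = 1.0 / math.log2(2 + node_degree)
    return raw_score * edge_factor * degree_penalty


# A hub supported only by generic text-mined edges vs. a lower-degree node
# with one curated, specific edge (degrees here are made-up numbers).
hub = adjusted_score(0.9, [("text_mining", "affects")] * 3, node_degree=5000)
specific = adjusted_score(
    0.9, [("curated", "treats"), ("text_mining", "affects")], node_degree=50
)
```

With these made-up numbers the hub result ends up below the curated one, which is the ordering I would expect as a user. The logarithmic penalty is just one possible choice of degree correction.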

sierra-moxon commented 3 months ago

from TAQA:

we see again the issue of reversed directions in paths
we see results with much evidence (from the same kind of source, e.g. Text Mining) being returned with the same score as results with much evidence (from diverse sources, e.g. including curated sources). This is unintuitive to the user, unless they are concerned primarily with novelty.
perhaps we can tweak the scoring to favor curated sources over text mining, but there may be unintended effects here. (Unsecret is already penalizing SEMMED; Text Mining KG and curated are equal at the moment, but we want to think about whether results with multiple sources of evidence should rank higher than results with a single source.)
Is this an ARS issue - not just Unsecret? (@Rosinaweber has discussed this kind of evidence-based ranking). Happens in the ARS but managed by Rosina's team.
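
For the point about diverse evidence versus the same amount of evidence from a single kind of source, one possible approach is to give each source kind diminishing returns, so that extra edges from an already-counted kind add less than edges from a new kind. The sketch below is only an assumption for discussion, not the current Unsecret or ARS behavior; `evidence_score` and the source labels are hypothetical.

```python
import math


def evidence_score(counts_by_source):
    """Sum of log(1 + n) over source kinds.

    counts_by_source: dict mapping a source kind (e.g. "text_mining",
    "curated") to the number of supporting edges of that kind.
    """
    return sum(math.log1p(n) for n in counts_by_source.values())


# Same total amount of evidence (10 edges), different diversity.
single_source = evidence_score({"text_mining": 10})
diverse = evidence_score({"text_mining": 5, "curated": 5})
```

Here `diverse` comes out higher than `single_source` even though both have ten supporting edges. If novelty is what the user cares about, this kind of bonus may not be desirable, so it would probably need to be configurable.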

sierra-moxon commented 3 months ago

@Rosinaweber - is this kind of ranking change part of the work you're currently doing for this phase and is this a good test case for you? :). thanks!

(ps. if you could assign a sprint that would be icing on the cake!)

NCATSTranslator / Feedback