opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Inclusion of direction of effect (for coloc) to the L2G evidence in Platform #2831

Closed buniello closed 1 year ago

buniello commented 1 year ago

We would like to include information on direction of effect in the L2G evidence in the Platform. This information is already available in the Genetics API for the variant/QTL colocalisation table, through the QTL beta direction.

This task will facilitate future implementation of the Target Engine project in the Platform. @Juanmaria-rr will check the following numbers:

  1. L2G evidence without colocalisation with QTLs (%)
  2. number of contradictions across tissues for the same L2G
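
A minimal sketch of how these two numbers could be gathered. The record layout (`study_locus`, `gene`, `tissue`, `qtl_beta`) and the sample rows are hypothetical, not the Genetics Portal schema; a missing `qtl_beta` stands for L2G evidence with no colocalising QTL.

```python
from collections import defaultdict

# Hypothetical rows: one per (L2G evidence, tissue) pair.
# qtl_beta is None when the L2G evidence has no colocalising QTL.
rows = [
    {"study_locus": "S1", "gene": "G1", "tissue": "liver", "qtl_beta": -0.4},
    {"study_locus": "S1", "gene": "G1", "tissue": "blood", "qtl_beta": 0.2},
    {"study_locus": "S2", "gene": "G2", "tissue": "liver", "qtl_beta": 0.3},
    {"study_locus": "S2", "gene": "G2", "tissue": "brain", "qtl_beta": 0.1},
    {"study_locus": "S3", "gene": "G3", "tissue": None, "qtl_beta": None},
]


def scope_l2g(rows):
    """Count L2G evidence lacking QTL coloc and evidence with mixed QTL signs."""
    signs = defaultdict(set)  # (study_locus, gene) -> set of observed beta signs
    for row in rows:
        observed = signs[(row["study_locus"], row["gene"])]
        beta = row["qtl_beta"]
        if beta is not None and beta != 0:
            observed.add(beta > 0)
    total = len(signs)
    without_coloc = sum(1 for s in signs.values() if not s)
    contradictory = sum(1 for s in signs.values() if len(s) == 2)
    pct = 100 * without_coloc / total if total else 0.0
    return {"pct_without_coloc": pct, "n_contradictory": contradictory}


summary = scope_l2g(rows)
print(summary)
```

Here an L2G evidence counts as contradictory when its QTL betas point in opposite directions in at least two tissues.
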
buniello commented 1 year ago

From the Sequence Ontology (SO): SO:0002315, SO:0002316, SO:0002314

buniello commented 1 year ago

Hi @Juanmaria-rr, did you make any progress gathering the numbers to scope this task?

d0choa commented 1 year ago

I'm taking a pass at the data with @ireneisdoomed. We should have a decision on the design today. @DSuveges is also in the loop.

ireneisdoomed commented 1 year ago

After discussion, @DSuveges, @d0choa and I have concluded that QTLs give us a meaningful signal when the derived effect is loss of function (LoF), whereas gain of function (GoF) calls are unreliable: when we estimate that the effect is a gain of function, this is usually wrong. The QTL-derived effect is taken from the signal with the largest effect size.
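
A minimal sketch of the "largest effect size" rule, assuming the direction is read from the sign of the QTL beta. The mapping of a negative beta to LoF and a positive beta to GoF, and the function name `coloc_direction`, are assumptions made for illustration; the actual analysis is in the linked gist.

```python
def coloc_direction(qtl_betas):
    """Return 'LoF', 'GoF', or None from the QTL betas of one association.

    Assumption: negative beta -> LoF, positive beta -> GoF.
    """
    # Drop missing and zero betas, which carry no direction.
    usable = [beta for beta in qtl_betas if beta]
    if not usable:
        return None
    # Keep the signal with the largest absolute effect size.
    strongest = max(usable, key=abs)
    return "GoF" if strongest > 0 else "LoF"


print(coloc_direction([0.2, -0.9, 0.5]))
```

With betas of 0.2, -0.9 and 0.5, the strongest signal is -0.9, so the association is labelled LoF even though two of the three signals are positive.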

The benchmarking was done against the ChEMBL dataset provided by @Juanmaria-rr, where all disease/target associations are considered to be protective because they are derived from drugs that have an inhibitory effect on the target (meaning that presence of target -> presence of disease).

More notes on the analysis

Code available at: https://gist.github.com/ireneisdoomed/5a452eda3cd7e58b1d0c987f5239fdbe

| COLOC EFFECT | DRUG EFFECT: LoF | DRUG EFFECT: GoF | Total |
| --- | --- | --- | --- |
| LoF | 123 | 3 | 126 |
| GoF | 107 | 17 | 124 |
| Total | 230 | 20 | |

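
As a rough check on the statement that GoF calls are usually wrong, the counts in the table above can be turned into per-call agreement rates. This is only a sketch over the reported counts; the dictionary layout and function name are illustrative.

```python
# Counts copied from the table above: outer keys are the coloc-derived effect,
# inner keys are the drug-derived (ChEMBL) effect.
matrix = {
    "LoF": {"LoF": 123, "GoF": 3},
    "GoF": {"LoF": 107, "GoF": 17},
}


def agreement_by_prediction(matrix):
    """Fraction of each coloc call that matches the drug-derived effect."""
    return {
        predicted: row[predicted] / sum(row.values())
        for predicted, row in matrix.items()
    }


rates = agreement_by_prediction(matrix)
for label, rate in rates.items():
    print(f"coloc {label}: {rate:.1%} agree with the drug effect")
```

On these counts, about 97.6% of coloc LoF calls agree with the drug effect, against about 13.7% of coloc GoF calls.
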
ireneisdoomed commented 1 year ago

I have prepared a table with the results discussed above on the validation set. It essentially describes how well QTLs inform the directionality of target/disease associations compared with ChEMBL's, for which the role of the target is known.

https://docs.google.com/spreadsheets/d/1vNt8Yn9og0J1HqVblXIGaN8Wvn5ysmz02fR3rD9EgoU/edit#gid=1143667412

ireneisdoomed commented 1 year ago

We have rerun the analysis after @Juanmaria-rr identified a bug in the drug mechanisms we were using as the gold standard: agonists were wrongly assigned. We treated them as LoF mechanisms when they actually indicate that activation of the target is protective against the disease. The bottom line is that LoF associations were overrepresented in the previous metrics. The LoF/GoF ratio is ~5, not 11 as we reported before.

With this fix in place, the numbers look much more favourable for the "predictive" validity of the coloc-derived mechanism of action.

| COLOC EFFECT | DRUG EFFECT: LoF | DRUG EFFECT: GoF | Total |
| --- | --- | --- | --- |
| LoF | 113 | 13 | 126 |
| GoF | 92 | 53 | 145 |
| Total | 205 | 66 | |
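
To make the change between the two runs concrete, the sketch below computes the same agreement rates for the original and the rerun counts reported in this thread. The function and key names are illustrative only.

```python
# Rows: coloc-derived effect (LoF, GoF); columns: drug-derived effect (LoF, GoF).
before = [[123, 3], [107, 17]]   # counts reported before the agonist fix
after = [[113, 13], [92, 53]]    # counts reported after the rerun


def agreement(matrix):
    """Agreement of coloc calls with the drug effect, per call and overall."""
    (lof_lof, lof_gof), (gof_lof, gof_gof) = matrix
    total = lof_lof + lof_gof + gof_lof + gof_gof
    return {
        "lof_agreement": lof_lof / (lof_lof + lof_gof),
        "gof_agreement": gof_gof / (gof_lof + gof_gof),
        "overall_agreement": (lof_lof + gof_gof) / total,
    }


for name, matrix in (("before", before), ("after", after)):
    stats = agreement(matrix)
    print(name, {key: f"{value:.1%}" for key, value in stats.items()})
```

On these counts, agreement for coloc GoF calls rises from about 13.7% to about 36.6%, agreement for coloc LoF calls drops from about 97.6% to about 89.7%, and overall agreement moves from 56.0% to about 61.3%.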

As @d0choa pointed out, it is interesting to look at the breakdown when max clinical phase is factored in. We see a trend where the higher the clinical validity of an association, the better we are at deriving its mechanism, and this holds in both directions.

[Figure: mechanism_bar]