Select the list of effector genes for "out-of-sample" validation

Select the list of effector genes for "out-of-sample" validation #3528

Open addramir opened 1 week ago

addramir commented 1 week ago

Related to https://github.com/opentargets/issues/issues/3500. As discussed before, we should select a list of gene-EFO pairs for additional validation of the resulting L2G scores. These out-of-sample effector genes will not take part in training the model. The current idea is to use Eric Fauman's list of genes, since we use only our old curated list and ChEMBL for training.
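A minimal sketch of the "out-of-sample" requirement, assuming gene-EFO pairs are represented as plain `(gene, efo)` tuples. The function name `out_of_sample` and all identifiers in the toy data are hypothetical placeholders, not real genes, EFO terms, or entries from any of the lists mentioned above. The idea is only that any candidate pair already present in the training set gets dropped before validation.

```python
def out_of_sample(candidate_pairs, training_pairs):
    """Return candidate (gene, efo) pairs that do not appear in the training set."""
    training = set(training_pairs)
    # Sorting makes the result deterministic and easy to diff between runs.
    return sorted(set(candidate_pairs) - training)


# Toy data with made-up identifiers (assumption: pairs are exact-match tuples).
training_pairs = [
    ("GENE_A", "EFO_X"),
    ("GENE_B", "EFO_Y"),
]
candidate_pairs = [
    ("GENE_A", "EFO_X"),  # overlaps with training, must be dropped
    ("GENE_C", "EFO_X"),
    ("GENE_D", "EFO_Z"),
]

held_out = out_of_sample(candidate_pairs, training_pairs)
print(held_out)
```

Exact tuple matching is the simplest possible check; whether related EFO terms for the same gene should also count as overlap is an open question this sketch does not address.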

addramir commented 6 days ago

The draft plan:

1) Take Eric Fauman's list (we are not using it for training).
2) Select the best CS as the gold positive and assign gold negatives, similar to what we do for training.
3) Use it to validate the model, e.g. count FP, FN, TP and TN using l2g>=0.5.

Compare the result with the distance-only approach (closest by TSS) and the holistic approach.
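Step 3 and the distance comparison could be sketched roughly as below. This is an illustration under assumptions, not the actual pipeline: the `Candidate` record, its field names, and the toy values are all hypothetical, and the gold positive/negative labels are taken as already assigned in step 2. The holistic approach is not covered because it is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    credible_set: str       # hypothetical credible-set identifier
    gene: str               # candidate gene in the locus
    l2g: float              # model score in [0, 1]
    tss_distance: int       # distance to the gene TSS in bp (assumed field)
    is_gold_positive: bool  # label assigned from the held-out list


def confusion(candidates, predicted):
    """Count TP/FP/FN/TN for a predicate that marks a candidate as positive."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for c in candidates:
        if predicted(c):
            counts["TP" if c.is_gold_positive else "FP"] += 1
        else:
            counts["FN" if c.is_gold_positive else "TN"] += 1
    return counts


def closest_by_tss(candidates):
    """Pick, per credible set, the gene with the smallest TSS distance."""
    best = {}
    for c in candidates:
        current = best.get(c.credible_set)
        if current is None or c.tss_distance < current.tss_distance:
            best[c.credible_set] = c
    return {(c.credible_set, c.gene) for c in best.values()}


# Toy data with made-up values.
candidates = [
    Candidate("cs1", "GENE_A", 0.8, 50_000, True),
    Candidate("cs1", "GENE_B", 0.3, 1_000, False),
    Candidate("cs2", "GENE_C", 0.6, 2_000, False),
    Candidate("cs2", "GENE_D", 0.4, 30_000, True),
    Candidate("cs3", "GENE_E", 0.9, 500, True),
    Candidate("cs3", "GENE_F", 0.1, 80_000, False),
]

# L2G rule from the plan: call a gene positive when l2g >= 0.5.
l2g_counts = confusion(candidates, lambda c: c.l2g >= 0.5)

# Distance baseline: call only the closest gene in each credible set positive.
nearest = closest_by_tss(candidates)
distance_counts = confusion(
    candidates, lambda c: (c.credible_set, c.gene) in nearest
)

print("l2g>=0.5:", l2g_counts)
print("closest by TSS:", distance_counts)
```

On this toy data the L2G threshold gives TP=2, FP=1, FN=1, TN=2 and the distance baseline gives TP=1, FP=2, FN=2, TN=1. Note that the two rules are not symmetric: the threshold can call zero or several genes per credible set, while the distance rule always calls exactly one.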