monarch-initiative / pheval

A framework for empirical evaluation of phenotype matching and prioritisation
https://monarch-initiative.github.io/pheval/
Apache License 2.0

EPIC: Set up a basic comparative analysis for PhEval #2

Open matentzn opened 2 years ago

matentzn commented 2 years ago

This issue will be broken down into smaller chunks as our understanding of PhEval grows. It describes a "starter" project that I estimate will take around 2 months to complete.

Some initial research questions:

  1. How do uPheno 2 lattice, uPheno 2 equivalence, and uPheno 1 affect semantic similarity scores? Answering this will require a good characterisation of how semantic similarity results change over time, possibly involving things like the top 100 lost and gained scores, the distribution of score differences, etc.
  2. How does uPheno 2 equivalence plus ML mappings affect semantic similarity compared with uPheno 2 equivalence without additional mappings? Same as above, just with slightly different preprocessing.
  3. Provide a cursory (very, very basic) analysis that allows us to measure the effect of a specific semantic similarity table on the performance of Exomiser. This is a pretty tough problem to solve because, right now, Exomiser needs to be recompiled for changes to the tables to be reflected. The analysis should be able to visualise changes between different Exomiser runs.
  4. Provide a simple Python CLI tool, built with click, that runs in the PhEval Docker container, wraps pairwise comparisons for 1, 2 and 3 above, and outputs the comparative analysis as a Markdown document.
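
To make the "top 100 lost and gained scores" idea from question 1 concrete, here is a minimal sketch of one way it could be computed and rendered as Markdown. It is not PhEval code: the function names (`score_deltas`, `top_changes`, `to_markdown`), the representation of a table as a dict keyed by term pair, and the placeholder term IDs in the usage note are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]        # (subject term ID, object term ID), assumed layout
Scores = Dict[Pair, float]    # pair -> semantic similarity score


def score_deltas(old: Scores, new: Scores) -> Dict[Pair, float]:
    """Return new - old for every pair present in both tables."""
    return {pair: new[pair] - old[pair] for pair in old.keys() & new.keys()}


def top_changes(deltas: Dict[Pair, float], n: int = 100):
    """Return the n largest gains and the n largest losses.

    Ties are broken by pair so the output is deterministic.
    """
    gained = sorted(
        (kv for kv in deltas.items() if kv[1] > 0),
        key=lambda kv: (-kv[1], kv[0]),
    )[:n]
    lost = sorted(
        (kv for kv in deltas.items() if kv[1] < 0),
        key=lambda kv: (kv[1], kv[0]),
    )[:n]
    return gained, lost


def to_markdown(title: str, rows: List[Tuple[Pair, float]]) -> str:
    """Render (pair, change) rows as a small Markdown table."""
    lines = [
        f"### {title}",
        "",
        "| subject | object | change |",
        "| --- | --- | --- |",
    ]
    lines += [f"| {s} | {o} | {d:+.3f} |" for (s, o), d in rows]
    return "\n".join(lines)
```

With two made-up tables such as `old = {("A", "X"): 0.5, ("A", "Y"): 0.75}` and `new = {("A", "X"): 0.75, ("A", "Y"): 0.25}`, `top_changes(score_deltas(old, new))` puts `("A", "X")` in the gained list and `("A", "Y")` in the lost list. Pairs that appear in only one table are ignored here; whether they should count as gained or lost is an open design choice.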

Potential pitfalls

matentzn commented 2 years ago

@souzadevinicius (to remember your GH handle)

matentzn commented 1 year ago

@souzadevinicius

matentzn commented 1 year ago
pheval-utils compare-semsim --left profile1.tsv --right profile2.tsv --output results.json
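
The behaviour of the command above is not described in this thread. As a rough sketch only, a comparison taking the same three flags could look like the following. Everything beyond the flag names is an assumption: the TSV column names (`subject_id`, `object_id`, `score`), the fields of the JSON summary, and the use of the standard-library `argparse` in place of click, chosen here only to keep the example dependency-free.

```python
import argparse
import csv
import json
from statistics import mean


def parse_scores(lines):
    """Parse TSV lines with assumed columns subject_id, object_id, score."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {(r["subject_id"], r["object_id"]): float(r["score"]) for r in reader}


def compare(left, right):
    """Summarise how the right table differs from the left one."""
    shared = sorted(left.keys() & right.keys())
    deltas = [right[pair] - left[pair] for pair in shared]
    return {
        "shared_pairs": len(shared),
        "only_left": len(left.keys() - right.keys()),
        "only_right": len(right.keys() - left.keys()),
        "mean_change": mean(deltas) if deltas else None,
    }


def build_parser():
    """Build a parser mirroring the --left/--right/--output flags."""
    parser = argparse.ArgumentParser(description="Compare two score tables (sketch).")
    parser.add_argument("--left", required=True, help="first TSV file")
    parser.add_argument("--right", required=True, help="second TSV file")
    parser.add_argument("--output", required=True, help="JSON file to write")
    return parser


def main(argv=None):
    """Read both tables, compare them, and write the summary as JSON."""
    args = build_parser().parse_args(argv)
    with open(args.left, newline="") as handle:
        left = parse_scores(handle)
    with open(args.right, newline="") as handle:
        right = parse_scores(handle)
    summary = compare(left, right)
    with open(args.output, "w") as handle:
        json.dump(summary, handle, indent=2)
    return summary
```

Calling `main(["--left", "profile1.tsv", "--right", "profile2.tsv", "--output", "results.json"])` would then write the summary to `results.json`. Keeping `parse_scores` and `compare` free of file handling makes them easy to reuse from a click command later.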