monarch-initiative / pheval

A framework for empirical evaluation of phenotype matching and prioritisation
https://monarch-initiative.github.io/pheval/
Apache License 2.0

Create new pheval-utils function to generate semsim score distribution plots #187

Open souzadevinicius opened 1 year ago

souzadevinicius commented 1 year ago

The pheval-utils semsim-comparison command will get a new option to generate semsim score distribution plots,

e.g.

Usage: pheval-utils semsim-comparison [OPTIONS]

Compares semantic similarity profiles

Args:
    input (List[Path]): File paths of the semantic similarity profiles.
    output-folder (Path): Output folder path for the comparisons.
    score_column (str): Score column that will be computed (e.g. jaccard_similarity).
    analysis (str): Type of analysis to run. Defaults to "heatmap". There are three types:
        heatmap - Generates a heatmap plot that shows the differences between the semantic similarity profiles, using the score column for this purpose.
        percentage_diff - Calculates the percentage difference of the score column between the semantic similarity profiles.
        distribution - Generates plots comparing the semsim score distributions.

The semsim-comparison command should accept an arbitrary number of inputs.

Two plots must be generated:

  1. histogram bars plotted for every input file, one by one, from top to bottom
  2. distribution plot with side-by-side bars and distribution lines for each input
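
A rough sketch of how the two plots could be produced for any number of inputs, not the actual pheval implementation. The function names, output file names, and the `bins` parameter are hypothetical. It assumes matplotlib is available and that the score column has already been read into one list of floats per input. A line through the bin counts stands in for the "distribution line"; a proper density estimate could replace it.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def shared_bin_edges(score_sets: Sequence[Sequence[float]], bins: int = 20) -> List[float]:
    """Bin edges spanning the scores of all inputs, so the histograms are comparable."""
    all_scores = [s for scores in score_sets for s in scores]
    lo, hi = min(all_scores), max(all_scores)
    if lo == hi:  # avoid zero-width bins when every score is identical
        hi = lo + 1.0
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


def histogram_counts(scores: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count scores per bin; a score equal to the last edge goes into the last bin."""
    counts = [0] * (len(edges) - 1)
    lo = edges[0]
    width = (edges[-1] - lo) / len(counts)
    for s in scores:
        idx = min(int((s - lo) / width), len(counts) - 1)
        counts[idx] += 1
    return counts


def plot_distributions(scores_by_input: Dict[str, Sequence[float]],
                       output_folder: Path, bins: int = 20) -> None:
    """Write both plots (hypothetical file names) into output_folder."""
    import matplotlib
    matplotlib.use("Agg")  # render to files, no display needed
    import matplotlib.pyplot as plt

    labels = list(scores_by_input)
    edges = shared_bin_edges(list(scores_by_input.values()), bins)
    counts = {label: histogram_counts(scores_by_input[label], edges) for label in labels}
    width = edges[1] - edges[0]
    centres = [e + width / 2 for e in edges[:-1]]

    # Plot 1: one histogram per input file, stacked from top to bottom.
    fig, axes = plt.subplots(len(labels), 1, sharex=True, squeeze=False)
    for ax, label in zip(axes[:, 0], labels):
        ax.bar(centres, counts[label], width=width)
        ax.set_title(label)
    fig.savefig(output_folder / "semsim_histograms.png")
    plt.close(fig)

    # Plot 2: side-by-side bars plus one line per input on a single axis.
    fig, ax = plt.subplots()
    bar_width = width / len(labels)
    for i, label in enumerate(labels):
        xs = [e + (i + 0.5) * bar_width for e in edges[:-1]]
        ax.bar(xs, counts[label], width=bar_width, alpha=0.5, label=label)
        ax.plot(centres, counts[label])
    ax.legend()
    fig.savefig(output_folder / "semsim_distribution.png")
    plt.close(fig)
```

Computing the bin edges once across all inputs keeps the x-axis identical in both plots, which is what makes the per-file histograms comparable.
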
matentzn commented 1 year ago

The heatmap analysis should have a second parameter, "-t/--terms", which restricts the heatmap to specific terms.
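
One possible reading of the term restriction, as a minimal sketch. The helper name and the `subject_id` / `object_id` column names are assumptions about the profile layout, not taken from pheval. It also assumes a row is kept only when both of its terms are in the requested set, which the comment above does not specify.

```python
from typing import Dict, Iterable, List, Optional


def filter_rows_by_terms(rows: Iterable[Dict[str, str]],
                         terms: Optional[Iterable[str]],
                         subject_col: str = "subject_id",
                         object_col: str = "object_id") -> List[Dict[str, str]]:
    """Keep rows whose subject and object terms are both in `terms`.

    With no terms given, every row is kept, so the option stays optional.
    """
    if not terms:
        return list(rows)
    wanted = set(terms)
    return [r for r in rows if r[subject_col] in wanted and r[object_col] in wanted]


# Example rows shaped like a semsim profile (the values are made up).
rows = [
    {"subject_id": "HP:0000001", "object_id": "HP:0000002", "jaccard_similarity": "0.5"},
    {"subject_id": "HP:0000001", "object_id": "HP:0000003", "jaccard_similarity": "0.2"},
]
kept = filter_rows_by_terms(rows, ["HP:0000001", "HP:0000002"])
```

Filtering before the heatmap is built keeps the plotting code unchanged; only the rows passed to it shrink.
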