Closed by ielis 1 month ago
Related to #126
The user can choose the phenotypic features to test by using the "Specified terms" MTC filter strategy.
```python
from genophenocorr.analysis import CohortAnalysisConfiguration

config = CohortAnalysisConfiguration()
# Restrict testing to the listed HPO terms only.
config.specify_terms_strategy(("HP:0001250", "HP:0001166"))
```
It is better to test only a small number of phenotypic features: fewer hypotheses mean a lighter multiple testing correction and fewer chances for a false discovery. Running fewer tests is better for statistics and for the environment!
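To make the multiple testing point concrete, here is a minimal sketch. It uses a Bonferroni correction only because it is the simplest one to show; which MTC procedure the analysis actually applies is not stated here. The p-values are made up for illustration.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values, capped at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical nominal p-value for one feature of interest.
p_of_interest = 0.004

# Scenario A: only two specified terms are tested.
targeted = [p_of_interest, 0.20]
# Scenario B: the same feature is tested alongside 99 other terms.
broad = [p_of_interest] + [0.5] * 99

adjusted_targeted = bonferroni(targeted)
adjusted_broad = bonferroni(broad)

# The same nominal p-value is penalized far more in the broad scenario.
print("2 tests:  ", adjusted_targeted[0])
print("100 tests:", adjusted_broad[0])
```

With a significance threshold of 0.05, the feature of interest would pass in the targeted scenario but not in the broad one.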
I think there are 2 things that need to be done here. First, we need to present counts of phenotypic features. We have a function that does that:
This is good basic functionality, and we can make it more convenient by adding term labels and (maybe) even returning the result as a pandas `DataFrame`. The frame could also break down the `count` into the number of direct and indirect (implied by the annotation propagation rule) annotations.
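A rough sketch of what such a breakdown could look like, in plain Python. Everything here is hypothetical and not the existing function: the `PARENTS` and `LABELS` dictionaries are a tiny hand-written excerpt for illustration (check edges and labels against a real HPO release), and `count_features` is an invented name.

```python
from collections import Counter

# Toy ontology excerpt (child -> parents), for illustration only.
PARENTS = {
    "HP:0007359": ["HP:0001250"],
    "HP:0001250": ["HP:0012638"],
    "HP:0012638": [],
}
LABELS = {
    "HP:0007359": "Focal-onset seizure",
    "HP:0001250": "Seizure",
    "HP:0012638": "Abnormal nervous system physiology",
}


def ancestors(term_id):
    """Collect all ancestors of term_id (the term itself is excluded)."""
    seen = set()
    stack = list(PARENTS.get(term_id, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def count_features(patients):
    """Count direct and indirect (propagated) annotations per term."""
    direct, indirect = Counter(), Counter()
    for annotations in patients:
        annotated = set(annotations)
        direct.update(annotated)
        implied = set()
        for term_id in annotated:
            implied |= ancestors(term_id)
        # A term annotated directly is not counted again as indirect.
        indirect.update(implied - annotated)
    rows = []
    for term_id in sorted(set(direct) | set(indirect)):
        n_direct, n_indirect = direct[term_id], indirect[term_id]
        rows.append({
            "term_id": term_id,
            "label": LABELS.get(term_id, ""),
            "direct": n_direct,
            "indirect": n_indirect,
            "count": n_direct + n_indirect,
        })
    return rows


# Three hypothetical patients, each a list of directly annotated terms.
patients = [
    ["HP:0007359"],
    ["HP:0001250"],
    ["HP:0007359", "HP:0001250"],
]
rows = count_features(patients)
for row in rows:
    print(row)
```

The list of dicts can be passed straight to `pandas.DataFrame(rows)` if a frame is preferred.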
Second, we need to filter the phenotypic features prior to running the analysis. We already do one such filtering using the `min_perc_patients_w_hpo` configuration option. We may want to add another filter that lets the user choose a set of HPO terms to test. Note, the ancestors of these terms will not be tested! We need to think about how to do this.
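One possible shape for the term filter, as a sketch for discussion only. The function names `select_terms` and `unmatched_terms` are hypothetical and not part of the current API; the candidate list stands in for whatever terms the cohort yields after annotation propagation.

```python
def select_terms(candidate_terms, specified_terms):
    """Keep only the terms the user asked for, preserving candidate order.

    Ancestors of the specified terms are dropped unless they were
    specified explicitly.
    """
    wanted = set(specified_terms)
    return [term_id for term_id in candidate_terms if term_id in wanted]


def unmatched_terms(candidate_terms, specified_terms):
    """Return specified terms that are absent from the cohort, so we can warn."""
    present = set(candidate_terms)
    return [term_id for term_id in specified_terms if term_id not in present]


# Hypothetical candidates: direct annotations plus propagated ancestors.
candidates = ["HP:0012638", "HP:0001250", "HP:0007359", "HP:0001166"]
specified = ("HP:0001250", "HP:0001166", "HP:0000001")

to_test = select_terms(candidates, specified)
not_found = unmatched_terms(candidates, specified)

print("Terms to test:", to_test)
print("Specified but not in cohort:", not_found)
```

Reporting the unmatched terms separately would let us warn about typos in the term IDs instead of silently testing nothing.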
Related to #44, #98