micom-dev / micom

Python package to study microbial communities using metabolic modeling.
https://micom-dev.github.io/micom
Apache License 2.0

[feature] Transition the phenotype associations to non-parametric tests #171

Closed cdiener closed 5 months ago

cdiener commented 5 months ago

Purpose

The former strategy of identifying metabolite:phenotype associations from the coefficients of a LASSO model has proven quite unstable due to (a) the non-uniqueness of the coefficients and (b) instability across scikit-learn versions and initializations. This PR switches to a more stringent approach that uses Mann-Whitney U or Spearman rho tests to assess metabolite-level associations. A LASSO model is still fit to assess the overall/global association with the phenotype.
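To illustrate the metabolite-level tests, here is a minimal sketch using SciPy. It is not the code from this PR: the helper `metabolite_associations`, the toy flux matrix, and the metabolite names are all hypothetical. It also assumes that the Mann-Whitney U test is applied to binary phenotypes and Spearman rho to continuous ones.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr


def metabolite_associations(fluxes, phenotype, names, binary):
    """Test each metabolite (one column of `fluxes`) against the phenotype.

    Hypothetical helper for illustration only; not part of the micom API.
    Returns a list of (name, statistic, p_value) tuples.
    """
    results = []
    for j, name in enumerate(names):
        x = fluxes[:, j]
        if binary:
            # Assumption: binary phenotypes are compared with a two-sided
            # Mann-Whitney U test between the two groups.
            stat, p = mannwhitneyu(
                x[phenotype == 1], x[phenotype == 0], alternative="two-sided"
            )
        else:
            # Assumption: continuous phenotypes use Spearman rank correlation.
            stat, p = spearmanr(x, phenotype)
        results.append((name, float(stat), float(p)))
    return results


# Toy data: 8 samples, 2 made-up metabolites.
# metabolite_A separates the two groups perfectly, metabolite_B does not.
names = ["metabolite_A", "metabolite_B"]
fluxes = np.array(
    [[1, 1], [2, 8], [3, 3], [4, 6], [5, 2], [6, 7], [7, 4], [8, 5]],
    dtype=float,
)
binary_phenotype = np.array([0, 0, 0, 0, 1, 1, 1, 1])
continuous_phenotype = np.arange(1, 9) * 1.5

binary_results = metabolite_associations(fluxes, binary_phenotype, names, binary=True)
continuous_results = metabolite_associations(
    fluxes, continuous_phenotype, names, binary=False
)

for name, stat, p in binary_results:
    print(f"{name}: U={stat:.1f}, p={p:.3f}")
for name, stat, p in continuous_results:
    print(f"{name}: rho={stat:.2f}, p={p:.3f}")
```

Because both tests depend only on ranks, the result for a given metabolite does not change with solver initialization or regularization strength, which is the stability problem described above.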

Visualization

The visualization is similar but now shows a confusion matrix for binary outcomes. A quantitative effect measure will be used instead of coefficients.

example visualization
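As a rough sketch of how a confusion matrix for a binary outcome could be obtained from a global LASSO-style fit, the example below uses an L1-penalized logistic regression with cross-validated predictions in scikit-learn. The simulated fluxes, the choice of solver, `C=1.0`, and the 5-fold split are assumptions made for illustration and may differ from what this PR implements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated data (hypothetical): 40 samples, 5 metabolite fluxes.
# Only the first flux differs between the two phenotype groups.
rng = np.random.default_rng(42)
n_per_group = 20
phenotype = np.repeat([0, 1], n_per_group)
fluxes = rng.normal(size=(2 * n_per_group, 5))
fluxes[:, 0] += np.where(phenotype == 1, 10.0, -10.0)

# Assumption: the global model is an L1-penalized (LASSO-style) logistic
# regression on standardized fluxes.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
)

# Out-of-fold predictions, so each sample is predicted by a model
# that did not see it during training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
predicted = cross_val_predict(model, fluxes, phenotype, cv=cv)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(phenotype, predicted, labels=[0, 1])
print(cm)
```

Using out-of-fold predictions keeps the confusion matrix from being inflated by overfitting, so it reflects how well the fluxes as a whole predict the phenotype rather than how any single coefficient behaves.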

Side effects

This adds new example data sets to help test and document the new functionality.

TODO