Implement CCA-classes that can account for sample groups

JohannesWiesner commented 1 year ago

Hi James, thought I would open this as an issue:

It would be nice if cca-zoo offered methods that can model sample groups (e.g. patients vs. healthy controls) in the datasets. Currently, many papers either fit a separate CCA for each of the n sample groups or "ignore" this group structure and fit a single CCA on the whole dataset. Most of the time the authors then compare weights, loadings or variate scores between the subject groups to get an idea of how the groups differ in their X-y relationships (or which relationships are shared). It would be more elegant to implement supervised models that know about the group-level covariance structure in the dataset (similar to GSCCA, except that here we are talking about sample groups, not feature groups).
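To make the two workarounds concrete, here is a minimal sketch on synthetic data. It uses a small NumPy-only CCA helper, not cca-zoo's API. All names (`cca_weights`, `make_group`, `cosine`) and the simulated data are assumptions for illustration only. The two groups are simulated so that a different X feature drives the X-y relationship in each of them.

```python
import numpy as np


def cca_weights(X, Y, reg=1e-8):
    """Return the first pair of canonical weights and the first canonical correlation."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance matrices, with a tiny ridge term for numerical stability
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten both views with Cholesky factors, then take the SVD
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    # Map the singular vectors back to the original feature space
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return wx, wy, s[0]


def make_group(rng, n, x_feature):
    """Simulate one sample group: a latent signal links X[:, x_feature] and Y[:, 0]."""
    z = rng.normal(size=n)
    X = rng.normal(size=(n, 4))
    Y = rng.normal(size=(n, 3))
    X[:, x_feature] = z + 0.3 * rng.normal(size=n)
    Y[:, 0] = z + 0.3 * rng.normal(size=n)
    return X, Y


def cosine(a, b):
    """Absolute cosine similarity (canonical weights are only defined up to sign)."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


rng = np.random.default_rng(0)
X_ctrl, Y_ctrl = make_group(rng, n=200, x_feature=0)  # hypothetical "controls"
X_pat, Y_pat = make_group(rng, n=200, x_feature=1)    # hypothetical "patients"

# Workaround 1: one CCA per sample group, then compare the weights
w_ctrl, _, r_ctrl = cca_weights(X_ctrl, Y_ctrl)
w_pat, _, r_pat = cca_weights(X_pat, Y_pat)

# Workaround 2: ignore the groups and fit a single CCA on the pooled data
X_all = np.vstack([X_ctrl, X_pat])
Y_all = np.vstack([Y_ctrl, Y_pat])
w_all, _, r_all = cca_weights(X_all, Y_all)

print("canonical correlations:", r_ctrl, r_pat, r_all)
print("controls vs patients weight similarity:", cosine(w_ctrl, w_pat))
print("controls vs pooled weight similarity:", cosine(w_ctrl, w_all))
```

In this toy setup the per-group fits recover different X weights, while the pooled fit blends them. Neither approach says directly which part of the relationship is shared and which is group-specific, and that gap is what a group-aware model would address.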

Methods such as joint ICA and joint sparse CCA already exist and appear to handle sample groups:

Correa et al. (2010)
Fang et al. (2016), with a MATLAB implementation found here (this package also seems to offer helper functions that can visualize shared and unique variances between sample groups, "modules" as the authors call them)
The MATLAB toolbox "FIT" implements joint ICA, parallel ICA and CCA-joint ICA. This toolbox also seems to be actively maintained, but it is (1) intended to be used as GUI software and (2) strictly tied to "SPM contrast images, EEG signals or SNP data", so it is not usable for other data modalities.

jameschapman19 commented 1 year ago

Yeah, I think this is a really interesting research direction. Tagged as a possible enhancement. I'm inclined to stick with CCA models to avoid feature creep! The ICA stuff could possibly go into mvlearn, assuming it's still maintained.

jameschapman19 commented 1 year ago

Yeah, the most immediately implementable one looks like JSCCA (Fang et al., 2016).

jameschapman19 / cca_zoo

Implement CCA-classes that can account for sample groups #180