Open LeonieBorne opened 4 years ago
Following #4, I propose we just use simulated data for this issue, and potentially for issue #8 as well. I am happy for people to take the CCA part and data simulation from here: https://github.com/htwangtw/cca_primer/blob/master/cca_notebook.ipynb @diiobo has experience with PLS! I will let her talk about what she can help with here :smile:
I think it's @fBeyer89 who has experience with PLS (definitely not me =)) )
Damn sorry!!!!!! Jump on issues if people think they can take up something :P
Is this section about using dimension reduction as a preprocessing step for CCA/PLS, or is it meant to also stand alone?
I am not sure it's best to have this section stand alone, since there are many ways to achieve this. You can do it as part of the preprocessing (e.g. Smith 2015) or integrate it with the CCA/PLS algorithm. I would personally pick one type of workflow to avoid complication.
@fBeyer89 I was thinking of showing the impact of using dimension reduction as a preprocessing step for CCA/PLS in this tutorial, but I am happy to change the roadmap and cover that in another tutorial that focuses on a specific pipeline! Maybe like the one in Smith 2015, as @htwangtw proposed?
I would personally keep things simple.
Tutorial 2. Data reduction
Would you like to participate in the writing of this tutorial? Or do you have a question about this tutorial? Let us know here!
Description
This tutorial focuses on dimensionality-reduction techniques (PCA, ICA, etc.) that can provide useful data preprocessing when the number of variables exceeds the number of samples.
Useful references