LeonieBorne / plstuto

Tutorials to apply cross decomposition methods in python
MIT License

Tutorial 2. Data reduction #7

Open LeonieBorne opened 4 years ago

LeonieBorne commented 4 years ago

Tutorial 2. Data reduction

Would you like to participate in the writing of this tutorial? Or do you have a question about this tutorial? Let us know here!

Description

This tutorial focuses on dimensionality-reduction techniques (PCA, ICA, etc.) that can provide useful data preprocessing when the number of variables exceeds the number of samples.
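
As a rough illustration of the situation described above, here is a minimal sketch of PCA on data with more variables than samples. The data are random numbers generated for the example, the sizes and `k = 10` are arbitrary assumptions, and PCA is computed directly with NumPy's SVD rather than with any particular library's PCA class.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 50, 200  # assumed sizes: more variables than samples
X = rng.standard_normal((n_samples, n_features))

# Center each variable, then take the SVD; rows of Vt are the principal axes.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total variance carried by each component.
explained = s**2 / np.sum(s**2)

# Keep the first k components (k = 10 is an arbitrary choice for this sketch).
k = 10
scores = Xc @ Vt[:k].T  # reduced data, shape (n_samples, k)

# Centering removes one degree of freedom, so at most n_samples - 1
# components can carry any variance, however many variables there are.
n_nonzero = int(np.sum(s > 1e-10 * s[0]))
print(scores.shape, n_nonzero)
```

The point of the sketch is that with fewer samples than variables the data can never span more than `n_samples - 1` directions, so a reduction step loses nothing beyond that limit.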

Useful references

htwangtw commented 4 years ago

Following #4, I propose we just use simulated data for this issue and potentially issue #8? I am happy for people to take the CCA part and data simulation from here: https://github.com/htwangtw/cca_primer/blob/master/cca_notebook.ipynb @diiobo has experience with PLS! I will let her say what she can help with here :smile:

diiobo commented 4 years ago

I think it's @fBeyer89 who had experience with PLS (definitely not me =)) )

htwangtw commented 4 years ago

Damn, sorry!!! Everyone, feel free to jump on any issue you think you can take on :P

fBeyer89 commented 4 years ago

Is this section about using dimension reduction as a preprocessing step for CCA/PLS, or is it meant to also stand alone?

htwangtw commented 4 years ago

I am not sure it's best to have this section stand alone, since there are many ways to do dimension reduction. You can do it as part of the preprocessing (e.g. Smith 2015) or integrate it into the CCA/PLS algorithm. I would personally pick one type of workflow to avoid complications.

LeonieBorne commented 4 years ago

@fBeyer89 I was thinking of showing the impact of using dimension reduction as a preprocessing step for CCA/PLS in this tutorial, but I am happy to change the roadmap and cover that in another tutorial that focuses on a specific pipeline! Maybe like the one in Smith 2015, as @htwangtw proposed?
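
To make the idea concrete, here is a minimal sketch of PCA used as a preprocessing step before CCA, on simulated data. Everything in it is an assumption made for illustration: the data-generating model (one shared latent variable), the sizes, the choice of `k = 5` components, and the helper `pca_basis`. It is not the pipeline from Smith 2015 or from the cca_primer notebook, and it uses plain NumPy rather than a packaged CCA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, k = 100, 300, 200, 5  # assumed sizes; k = 5 is arbitrary

# Simulated data: one latent variable drives the first 10 columns of X and Y.
latent = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, :10] += 2 * latent[:, None]
Y[:, :10] += 2 * latent[:, None]


def pca_basis(A, k):
    """Hypothetical helper: orthonormal basis (n x k) spanning the first k PCA scores."""
    Ac = A - A.mean(axis=0)
    U, s, Vt = np.linalg.svd(Ac, full_matrices=False)
    return U[:, :k]


# Step 1: reduce each block to k principal components.
Qx = pca_basis(X, k)
Qy = pca_basis(Y, k)

# Step 2: CCA on the reduced blocks. Because Qx and Qy are orthonormal and
# mean-centered, the canonical correlations are the singular values of Qx.T @ Qy.
canon_corr = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
print(np.round(canon_corr, 2))
```

In this setup the first canonical correlation should reflect the shared latent variable, while the remaining ones come from noise. These are in-sample values, so they are optimistic; a tutorial would probably want to evaluate them on held-out data.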

htwangtw commented 4 years ago

I personally would keep things simple.