hnolCol / instantclue

Instant Clue - Interactive Data Analysis
http://www.instantclue.de
GNU General Public License v3.0

issue with different Instant Clue versions generating different PCA plots #29

Closed a104347 closed 2 years ago

a104347 commented 2 years ago

Hello, I am having an issue with the Principal Component Analysis generating different plots when using Instant Clue v 0.5.3 or v 0.10.10.20210315.

I performed a proteome analysis at the proteomics facility in CECAD, Cologne and obtained a PCA plot from the facility (see attached image). I wanted to plot the data myself, so I used Instant Clue v 0.5.3 and created PCA plots via Analysis -> Dimensional reduction -> Principal Component A. This generated plots showing Component 2 plotted against Component 1 and Component 3 plotted against Component 2 (see attached image). The latter looked like the plot I received from the facility, except for differences in the percentages and in the axis scale.

I have now downloaded the newest version, v 0.10.10.20210315, and used the same data set to repeat the PCA via Right click -> Value transformation -> Dimensional reduction -> PCA (projection). This gives me a single PCA plot showing Component 1 plotted against Component 0, which looks the same as Component 2 plotted against Component 1 in the older version. I found it confusing that the components are now renamed, but more importantly, I can't figure out a way to get the second plot, the one that looks like the one from the facility.

I hope my message is not too confusing and I am looking forward to your reply. Best, Veronika

Instant Clue1

hnolCol commented 2 years ago

Dear Veronika, thank you for creating this issue.

I assume the plot from the proteomics facility was created using the Perseus software, which uses a different algorithm for calculating the PCA.

To create the plot: By default the new version calculates only two components and therefore you cannot create the plot using the third component.

Please go to the settings (bottom right) and find the "Dimensional reduction" settings.

image

There you can change the number of components to three (last item in the attached image).

Upon changing this, you will be able to create the desired plot.
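
As an illustration of why the number of components matters, here is a minimal NumPy sketch. It is not Instant Clue's actual code: `pca_scores` and the random `data` matrix are hypothetical, and the PCA is computed with a plain SVD. It shows that with two components only the columns with index 0 and 1 exist, while with three components the plot of the second against the third component (index 1 against index 2 in 0-based naming) becomes available.

```python
import numpy as np


def pca_scores(data, n_components=2):
    """Project the rows of `data` onto the first `n_components` principal components.

    Hypothetical helper for illustration; not part of Instant Clue.
    """
    centered = data - data.mean(axis=0)           # centre each column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)           # fraction of variance per component
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
data = rng.normal(size=(6, 50))                   # made-up matrix: 6 rows x 50 columns

scores2, explained2 = pca_scores(data, n_components=2)
scores3, explained3 = pca_scores(data, n_components=3)

# Two components: only column indices 0 and 1 exist, so no third axis can be plotted.
print(scores2.shape)
# Three components: index 2 (0-based) corresponds to "Component 3" in 1-based naming.
print(scores3.shape)

# Axes for the "Component 3 vs Component 2" plot of the old, 1-based naming.
x_axis, y_axis = scores3[:, 1], scores3[:, 2]
```

The first two columns are identical in both results; asking for a third component only adds a column, it does not change the existing ones.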

We are working hard on a new PCA plot which uses the Grouping function. There will be a lot of improvements to the dimensional reduction PCA plot. I will add this information to the Wiki, thank you.

Best wishes

Hendrik

a104347 commented 2 years ago

Dear Hendrik,

thank you for the fast reply. This was very helpful :)

Best, Veronika

hnolCol commented 2 years ago

Superb. :) Closing this issue.

hnolCol commented 2 years ago

You can try to first normalise the data using, for example:

Value Transformation -> Normalization (row) -> Quantile (25-75)

which will 'remove' the abundance differences in your data. Then perform the PCA on the normalised data. This will likely produce a plot and explained variances similar to those from the proteomics facility.
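
The effect of this normalisation can be sketched with NumPy on synthetic data. Everything here is an assumption made for illustration: that "Quantile (25-75)" means subtracting each row's median and dividing by its interquartile range, that rows are proteins and columns are samples, and that the PCA treats rows as observations. The helper names and the made-up data are hypothetical and are not taken from Instant Clue.

```python
import numpy as np


def normalize_rows_quantile(data, lower=25, upper=75):
    """Assumed row normalisation: (x - row median) / (row Q75 - row Q25)."""
    median = np.median(data, axis=1, keepdims=True)
    q_low, q_high = np.percentile(data, [lower, upper], axis=1)
    iqr = (q_high - q_low)[:, np.newaxis]
    iqr = np.where(iqr == 0, 1.0, iqr)            # guard against constant rows
    return (data - median) / iqr


def first_component(data):
    """Unit-length direction of the first principal component (one weight per column)."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


# Synthetic data: 200 proteins x 6 samples (two groups of three samples).
rng = np.random.default_rng(1)
n_proteins = 200
contrast = np.array([-0.5, -0.5, -0.5, 0.5, 0.5, 0.5])
abundance = rng.normal(25.0, 3.0, size=(n_proteins, 1))        # per-protein abundance level
effect = rng.normal(0.0, 1.0, size=(n_proteins, 1)) * contrast  # group difference
noise = rng.normal(0.0, 0.1, size=(n_proteins, 6))
raw = abundance + effect + noise

ones_dir = np.ones(6) / np.sqrt(6)                # "all samples alike" direction
contrast_dir = contrast / np.linalg.norm(contrast)

raw_pc = first_component(raw)
norm_pc = first_component(normalize_rows_quantile(raw))

# Alignment (absolute cosine) of the first component with each direction.
print("raw: alignment with abundance axis     ", abs(raw_pc @ ones_dir))
print("normalised: alignment with group contrast", abs(norm_pc @ contrast_dir))
```

Under these assumptions, the first component of the raw data mostly follows the overall abundance, and the group difference only shows up in a later component. After the row normalisation the group difference moves into the first component, which would be consistent with the shift in component numbering described in this thread.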