gabrielodom / pathwayPCA

integrative pathway analysis with modern PCA methodology and gene selection
https://gabrielodom.github.io/pathwayPCA/

Interpretation of PC1 values among groups #92

Open masaver opened 4 years ago

masaver commented 4 years ago

Hi Gabriel, I'm looking at a cancer transcriptomics dataset with Normal and Tumor samples, and I'm trying to quantify differences in pathway activity between Normal and Tumor tissue. I was able to run the analysis with the AESPCA_pVals() function, and I'm now looking at the PC1 values for the EGFR pathway (which I understand are a proxy for the activity of this pathway); see the picture below:

[Image: egfr (PC1 values for the EGFR pathway in Normal and Tumor samples)]

I expect/know that, for the specific cancer type I'm studying, the EGFR pathway is more active in Tumor than in Normal tissue.

So my question is: how do I go from interpreting positive or negative PC1 values to saying that a given pathway is more active in a given condition?

Thanks in advance for the help.

Best,

-Mathias

gabrielodom commented 4 years ago

Hi Mathias, I hate to be the bearer of bad news, but principal components are "unique up to a sign": if a vector is a valid PC, then so is that vector multiplied by -1. That means the sign of a PC can't tell you whether a pathway is up-regulated or down-regulated, only that it is differentially regulated. We recommend that users target pathways for post-hoc analysis based on the p-values (or FDRs) yielded by the AESPCA_pVals() or SuperPCA_pVals() functions. In your case, it's mathematically appropriate to multiply the EGFR pathway PC by -1 and re-apply your statistical tests.

See this thread for a related question and answer: https://stats.stackexchange.com/a/30352/171235
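
To illustrate the sign-flip point, here is a minimal sketch in plain Python (standard library only, not pathwayPCA code). The PC1 scores are made-up numbers, not values from the EGFR analysis, and `welch_t` is a hypothetical helper written for this example. It shows that multiplying a PC by -1 reverses the sign of a two-sample test statistic but leaves its magnitude unchanged.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (hypothetical helper)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up PC1 scores for one pathway; illustrative numbers only.
normal_pc1 = [1.2, 0.8, 1.5, 0.9, 1.1]
tumor_pc1 = [-0.7, -1.3, -0.9, -1.6, -1.0]

# Multiplying the PC by -1 gives an equally valid principal component.
normal_flipped = [-x for x in normal_pc1]
tumor_flipped = [-x for x in tumor_pc1]

t_original = welch_t(normal_pc1, tumor_pc1)
t_flipped = welch_t(normal_flipped, tumor_flipped)

# The statistic changes sign, but its absolute value is the same.
print("t (original sign):", t_original)
print("t (flipped sign): ", t_flipped)
```

Because a two-sided p-value depends only on the magnitude of the statistic, it is the same under either sign. The test can therefore say that the groups differ along PC1, but not which group is "higher" in any biological sense.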