Closed chainorato closed 3 years ago
Hi, scaling variance-stabilised data from DESeq2 should not be necessary for PCA. However, if you simple try t(scale(t(vsd)))
, does it produce an error? Have you pre-filtered your raw counts for genes with 0 counts across all samples?
Hi, Thank you for your suggestion.
Setting t(scale(t(vsd)))
still produced the same error.
However, I didn't pre-filtered raw counts for genes with 0 counts across all samples. So I pre-filtered them by following codes.
keep <- rowSums(counts(dds)) > 0
dds <- dds[keep,]
Then I created the plot again. Setting scale=False
returned the exact same plot,
and setting scale=TRUE
became worked. Here is the plot (scale=TRUE
) that was produced.
% variation of PC1 decreased
After that, I tried to use t(scale(t(vsd)))
instead of vsd
. The resulting plot was exactly same whether I set scale=TRUE
or scale=FALSE
.
Could you confirm for me whether scaling variance-stabilised data from DESeq2 is still necessary for PCA? Because scaling decreased % variation of PC1 from 57.5% to 36.71%, even though the plot looks the same.
Cool, so, the issue was related to genes of constant variance, i.e., all zeros.
I don't think that scaling the data is necessary when coming via the DESeq2::vst()
route. If the data was more 'difficult', like, e.g., proteomics data, then I would scale it. It is a question with no real concrete answer, though. It may have been asked a few times before on, e.g., StackExchange.
Please re-open the issue if problems still occur!
Hi, Thank you very much for developing this package.
I have just set up PCAtools today and I tried to make a PCA plot of expression profiles (transformed by
vst
function in DESeq2) by the following code.It smoothly produced the following nice plot.
However, when I set the scale parameter to TRUE for standardization.
p <- pca(vsd, metadata=samples, center=TRUE, scale=TRUE)
There is an error.
The documentation (
?svd
) states that it is a numeric or complex matrix whose SVD decomposition is to be computed. However, I could not understand how infinite or missing values popped up in 'x' which I am unsure which matrix it is.My questions are