IMB-Computational-Genomics-Lab / ascend

R package - Analysis of Single Cell Expression, Normalisation and Differential expression (ascend)

runTSNE step couldn't finish #22

Open lixin4306ren opened 5 years ago

lixin4306ren commented 5 years ago

The command I used: scran_normalised <- runTSNE(scran_normalised, PCA = FALSE)

The dataset has ~18000 cells and ~20000 genes. After running overnight, it still had not finished.

asenabouth commented 5 years ago

Hi @lixin4306ren,

The TSNE function runs a lot faster if you run it on PCA data instead of the expression data. This reduces the number of dimensions from the number of genes (~20000 in your case) down to a much smaller number of principal components.
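To illustrate why the PCA step helps, here is a minimal base R sketch on simulated data. It is not ascend's own code: the matrix sizes, the choice of 20 components, and the variable names are assumptions made for the example. It only shows how PCA shrinks the matrix that TSNE then has to work on.

```r
# Simulated stand-in for an expression matrix: cells in rows, genes in columns.
# Sizes are illustrative only (assumption), far smaller than a real dataset.
set.seed(42)
n_cells <- 300
n_genes <- 1000
expr <- matrix(rnorm(n_cells * n_genes), nrow = n_cells, ncol = n_genes)

# Keep only the first 20 principal components (20 is an arbitrary choice here).
pca <- prcomp(expr, center = TRUE, scale. = FALSE, rank. = 20)
pcs <- pca$x

dim(expr)  # 300 cells x 1000 genes
dim(pcs)   # 300 cells x 20 principal components
```

TSNE computes pairwise distances between cells, so each distance is calculated over 20 values per cell instead of 1000.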

If you wish to use TSNE directly on your expression data, you can extract the normalised counts using the normcounts or logcounts functions and pass them to the Rtsne function from the Rtsne package. Rtsne can use more cores (its num_threads argument) and has a partial_pca argument; together these should speed up the processing time. You can then store the result back into the EMSet using the reducedDim function.
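The steps above could look roughly like the sketch below. It uses a small simulated SingleCellExperiment as a stand-in for the EMSet, on the assumption that the same logcounts and reducedDim accessors apply to your object; the thread count, perplexity, initial_dims value, and the "TSNE" slot name are illustrative choices. Note that partial_pca only takes effect together with pca = TRUE and needs the irlba package installed, so Rtsne still performs a (truncated) PCA internally.

```r
library(SingleCellExperiment)
library(Rtsne)  # partial_pca = TRUE additionally requires the irlba package

set.seed(1)
# Toy stand-in for the EMSet (assumption): 200 genes x 120 cells.
counts <- matrix(rpois(200 * 120, lambda = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("cell", 1:120)))
sce <- SingleCellExperiment(assays = list(counts = counts,
                                          logcounts = log2(counts + 1)))

# Rtsne expects cells in rows, so transpose the genes x cells matrix.
expr <- t(as.matrix(logcounts(sce)))

tsne <- Rtsne(expr,
              dims = 2,
              perplexity = 30,
              pca = TRUE,           # initial PCA inside Rtsne
              partial_pca = TRUE,   # truncated PCA via irlba, faster on large data
              initial_dims = 20,    # number of components kept by that PCA
              num_threads = 2,      # set to the number of cores you can spare
              check_duplicates = FALSE)

# Store the 2D coordinates back into the object, one row per cell.
embedding <- tsne$Y
rownames(embedding) <- colnames(sce)
reducedDim(sce, "TSNE") <- embedding
```

On a real object, replace sce with your EMSet and pick num_threads to match your machine.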

lixin4306ren commented 5 years ago

Thank you for your prompt reply! Is there any significant difference between tsne results generated directly from the expression matrix and that based on PCA data? Thanks.

asenabouth commented 5 years ago

The results will look different. That's the nature of TSNE: it is stochastic, and the two runs start from different inputs. Here's a comparison of a TSNE generated from PCA-reduced values: (image: pca_tsne_plot)

And directly from expression data: (image: exprs_tsne_plot)