kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

plotGenePseudotime color by SeuratClusters #105

Closed EAC-T closed 3 years ago

EAC-T commented 3 years ago

Hi everyone, Thank you so much for creating slingshot!! It's the best for my data so far.

I have a SEURAT object and I followed the tutorial and did this:

sds <- slingshot(Embeddings(SeuratObj, "umap"), clusterLabels = SeuratObj$seurat_clusters, start.clus = 2, stretch = 0)

cell_colors_clust <- cell_pal(SeuratObj$seurat_clusters, hue_pal())
plot(reducedDim(sds), col = cell_colors_clust, pch = 16, cex = 0.5)
lines(sds, lwd = 2, type = 'lineages', col = 'black')

exp <- GetAssayData(SeuratObj, slot = "data")
plotGenePseudotime(sds, "gene1", exp)

I have two questions. First, how can I plot more than one gene at a time? Second, can I show the Seurat cluster colors on the figure?

Thank you a lot

kstreet13 commented 3 years ago

Thanks for the feedback! I should note that we are currently in the process of overhauling the plotting functions and moving them to a separate package, so this answer may not be relevant for very long. That said...

The plotGenePseudotime function passes any additional arguments to plot, so you should be able to layer one over top of an existing plot by specifying add = TRUE (though this might get confusing if both genes are represented by black smoothers, so you may want to use this version of the function, which lets you specify a color for the loess curve). Similarly, the col argument should work as usual, so adding col = cell_colors_clust should do the trick, in your case.

Also, as a minor note, the official slingshot vignette can be found here.
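The layering idea described above can be sketched in base R alone. This is only an illustration on simulated data, not the plotGenePseudotime implementation: the names pseudotime, cluster, gene1, gene2, and cluster_pal are all hypothetical stand-ins, and points() and lines() play the role that add = TRUE would play.

```r
# Simulated stand-ins (hypothetical): pseudotime, cluster labels, two genes.
set.seed(1)
n <- 200
pseudotime <- sort(runif(n, 0, 10))
cluster <- cut(pseudotime, breaks = 4, labels = FALSE)  # 4 fake clusters, coded 1..4
gene1 <- sin(pseudotime / 2) + rnorm(n, sd = 0.3)
gene2 <- pseudotime / 10 + rnorm(n, sd = 0.3)

# One color per cell, taken from its cluster (same role as cell_colors_clust).
cluster_pal <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
cell_colors <- cluster_pal[cluster]

# One loess smoother per gene.
fit1 <- loess(gene1 ~ pseudotime)
fit2 <- loess(gene2 ~ pseudotime)

# First gene: points colored by cluster, then its smoother in black.
plot(pseudotime, gene1, col = cell_colors, pch = 16, cex = 0.5,
     ylim = range(gene1, gene2), xlab = "Pseudotime", ylab = "Expression")
lines(pseudotime, predict(fit1), lwd = 2, col = "black")

# Second gene layered on the same axes, with a different smoother color.
points(pseudotime, gene2, col = cell_colors, pch = 1, cex = 0.5)
lines(pseudotime, predict(fit2), lwd = 2, col = "firebrick")
legend("topleft", legend = c("gene1", "gene2"), lwd = 2,
       col = c("black", "firebrick"), bty = "n")
```

Giving each smoother its own color is what keeps the overlaid genes distinguishable, since the points themselves are already used to encode the cluster.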

EAC-T commented 3 years ago

Thank you it's working.

I'm trying to use tradeSeq as well. To fit the GAM I did this:

sds <- slingshot(Embeddings(SeuratObj, "umap"), clusterLabels = SeuratObj$seurat_clusters, start.clus = 2, stretch = 0)
exp <- GetAssayData(TregEd_6, slot = "data")
set.seed(6)
sce <- fitGAM(counts = exp, sds = sds, verbose = T)

Is it the correct way to do it?

Thank you a lot!

kstreet13 commented 3 years ago

Great! And yes, that is correct, assuming that exp is the matrix of raw counts (I'm not super familiar with Seurat's naming conventions, so "data" may be correct, but in a SingleCellExperiment object, the relevant assay is typically called "counts").
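For readers unsure which matrix they have: in Seurat 3.x the "data" slot generally holds log-normalized values and the "counts" slot holds the raw counts, so GetAssayData(SeuratObj, slot = "counts") is usually the one to pass on. A quick sanity check is sketched below in base R on simulated data; the helper name looks_like_raw_counts is hypothetical, and the as.matrix() call assumes a matrix small enough to densify.

```r
# Hypothetical helper: TRUE if every value is a non-negative whole number,
# which is what a raw count matrix should look like.
looks_like_raw_counts <- function(x) {
  x <- as.matrix(x)  # fine for a small check; avoid on very large sparse matrices
  all(x >= 0) && all(x == round(x))
}

set.seed(1)
raw <- matrix(rpois(20, lambda = 5), nrow = 4)       # simulated raw counts
# Simulated log-normalization: scale each cell (column) to 10,000, then log1p.
normalized <- log1p(t(t(raw) / colSums(raw)) * 1e4)

looks_like_raw_counts(raw)         # TRUE
looks_like_raw_counts(normalized)  # FALSE
```

If the check returns FALSE on the matrix about to be passed as counts, it is probably a normalized assay rather than the raw one.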

EAC-T commented 3 years ago

Hi Kelly,

I followed the vignette that you linked above. I have a Seurat object, so the first thing I did was convert it into an SCE object:

sce <- as.SingleCellExperiment(seurat)
sce <- slingshot(sce, clusterLabels = 'seurat_clusters', reducedDim = 'UMAP', start.clus=2, stretch=0)
summary(sce$slingPseudotime_1)
colors <- colorRampPalette(brewer.pal(11,'Spectral')[-6])(100)
plotcol <- colors[cut(sce$slingPseudotime_1, breaks=100)]
plot(reducedDims(sce)$UMAP, col = plotcol, pch=16, asp = 1)
lines(SlingshotDataSet(sce), lwd=2, col='black')

tseq <- fitGAM(sce)
Error in .local(counts, ...) : unused argument (conditions = NULL)

I'm getting an error when trying to run fitGAM; any idea why? Also, when I plot the pseudotime, how can I get a key that shows where the pseudotime starts and ends, like a heatmap key?

Thank you again

kstreet13 commented 3 years ago

That's a strange error, since you didn't actually use the conditions argument in your code. It may be the case that you just need to update to the latest version of tradeSeq (1.4.0), but if that doesn't work, can you show the sessionInfo()?
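A short way to check an installed version without pasting the whole sessionInfo() is sketched below. The helper name has_min_version is hypothetical; packageVersion() and requireNamespace() are standard R functions.

```r
# Hypothetical helper: TRUE when `pkg` is installed at version `minimum` or newer.
has_min_version <- function(pkg, minimum) {
  requireNamespace(pkg, quietly = TRUE) && packageVersion(pkg) >= minimum
}

has_min_version("tradeSeq", "1.4.0")
```

Because requireNamespace() is evaluated first, the helper returns FALSE rather than raising an error when the package is not installed at all.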

And I really wish there was a quick and easy way to get that key (so if you find one, please let me know!). But I usually have to do something like this:

window <- par("usr")
li <- as.raster(matrix(colors, nrow=1))
rasterImage(li, xleft = window[1]*(2/3)+window[2]*(1/3),
            ybottom = window[3],
            xright = window[1]*(1/3)+window[2]*(2/3),
            ytop = window[3]*(19/20)+window[4]*(1/20))
text(c(window[1]*(2/3)+window[2]*(1/3), window[1]*(1/3)+window[2]*(2/3)),
     rep(window[3]*(19/20)+window[4]*(1/20), 2),
     round(range(sce$slingPseudotime_1, na.rm = TRUE), digits=2),
     cex = .75, pos = 3)
EAC-T commented 3 years ago

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252

attached base packages:
[1] splines stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] reticulate_1.18 SeuratWrappers_0.3.0 monocle_2.18.0 DDRTree_0.1.5 irlba_2.3.3
[6] VGAM_1.1-4 Biobase_2.50.0 BiocGenerics_0.36.0 Matrix_1.2-18 RColorBrewer_1.1-2
[11] dplyr_1.0.2 ggplot2_3.3.2 Seurat_3.2.2 tradeSeq_1.4.0 scales_1.1.1
[16] slingshot_1.8.0 princurve_2.1.5

loaded via a namespace (and not attached):
[1] plyr_1.8.6 igraph_1.2.6 lazyeval_0.2.2 BiocParallel_1.24.1
[5] densityClust_0.3 listenv_0.8.0 GenomeInfoDb_1.26.0 fastICA_1.2-2
[9] digest_0.6.27 htmltools_0.5.0 viridis_0.5.1 fansi_0.4.1
[13] magrittr_1.5 tensor_1.5 cluster_2.1.0 ROCR_1.0-11
[17] remotes_2.2.0 limma_3.46.0 globals_0.13.1 matrixStats_0.57.0
[21] docopt_0.7.1 colorspace_1.4-1 ggrepel_0.8.2 sparsesvd_0.2
[25] crayon_1.3.4 RCurl_1.98-1.2 jsonlite_1.7.1 spatstat_1.64-1
[29] spatstat.data_1.5-2 survival_3.2-7 zoo_1.8-8 ape_5.4-1
[33] glue_1.4.2 polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.36.0
[37] XVector_0.30.0 leiden_0.3.5 DelayedArray_0.16.0 future.apply_1.6.0
[41] SingleCellExperiment_1.12.0 abind_1.4-5 pheatmap_1.0.12 edgeR_3.32.0
[45] miniUI_0.1.1.1 Rcpp_1.0.5 viridisLite_0.3.0 xtable_1.8-4
[49] rsvd_1.0.3 htmlwidgets_1.5.2 httr_1.4.2 FNN_1.1.3
[53] ellipsis_0.3.1 ica_1.0-2 pkgconfig_2.0.3 farver_2.0.3
[57] uwot_0.1.8 deldir_0.2-2 locfit_1.5-9.4 tidyselect_1.1.0
[61] rlang_0.4.8 reshape2_1.4.4 later_1.1.0.1 munsell_0.5.0
[65] tools_4.0.3 cli_2.1.0 generics_0.1.0 ggridges_0.5.2
[69] stringr_1.4.0 fastmap_1.0.1 goftest_1.2-2 fitdistrplus_1.1-1
[73] purrr_0.3.4 RANN_2.6.1 pbapply_1.4-3 future_1.20.1
[77] nlme_3.1-149 mime_0.9 slam_0.1-47 compiler_4.0.3
[81] rstudioapi_0.13 plotly_4.9.2.1 png_0.1-7 spatstat.utils_1.17-0
[85] tibble_3.0.4 stringi_1.5.3 lattice_0.20-41 HSMMSingleCell_1.10.0
[89] vctrs_0.3.4 pillar_1.4.6 lifecycle_0.2.0 BiocManager_1.30.10
[93] combinat_0.0-8 lmtest_0.9-38 RcppAnnoy_0.0.16 data.table_1.13.2
[97] cowplot_1.1.0 bitops_1.0-6 httpuv_1.5.4 patchwork_1.1.0
[101] GenomicRanges_1.42.0 R6_2.5.0 promises_1.1.1 KernSmooth_2.23-17
[105] gridExtra_2.3 IRanges_2.24.0 parallelly_1.21.0 codetools_0.2-16
[109] assertthat_0.2.1 MASS_7.3-53 SummarizedExperiment_1.20.0 withr_2.3.0
[113] qlcMatrix_0.9.7 sctransform_0.3.1 S4Vectors_0.28.0 GenomeInfoDbData_1.2.4
[117] mgcv_1.8-33 grid_4.0.3 rpart_4.1-15 tidyr_1.1.2
[121] MatrixGenerics_1.2.0 Rtsne_0.15 shiny_1.5.0

I think I have version 1.4.0. I'm encountering many problems. For example:

heatdata <- assays(tseq)$counts[topgenes, pst.ord]
Error in assays(tseq) : could not find function "assays"

I'm not sure what is happening.

Thank you a lot Kelly for all your help! I greatly appreciate it.

kstreet13 commented 3 years ago

For the newest error, you probably just need to load the SummarizedExperiment package (you'll probably also want SingleCellExperiment, just in general). So is fitGAM working now? Otherwise, I'm not sure how you would have gotten to this point (where it seems like you have a tseq object).

EAC-T commented 3 years ago

Hi Kelly

fitGAM is working now, but the heatmap is not what I would expect. This is my code:

# Transform Seurat to SCE object
sce <- as.SingleCellExperiment(seurat)

# run slingshot
sce <- slingshot(sce, clusterLabels = 'seurat_clusters', reducedDim = 'UMAP', start.clus=2, stretch=0)

# creating SlingshotDataSet
slingsce <- SlingshotDataSet(sce)

# GETTING PARAMETERS
pseudotimeED <- slingPseudotime(slingsce, na = FALSE)
cellWeightsED <- slingCurveWeights(slingsce)
countsED <- sce@assays@data@listData$counts

# FITGAM
scegam <- fitGAM(counts = countsED, pseudotime = pseudotimeED, cellWeights = cellWeightsED, nknots = 7, verbose = T)

# association test
assoRes <- associationTest(scegam)

####### heatmap from slingshot vignette
topgenes <- rownames(assoRes[order(assoRes$pvalue), ])[1:250]
pst.ord <- order(sce$slingPseudotime_1, na.last = NA)

# it's here that I have problems
heatdata <- assays(scegam)$counts[topgenes, pst.ord]
heatclus <- sce$seurat_clusters[pst.ord]

Everything is working now. However, the heatmap is very ugly. I have 4 clusters and only 1 trajectory, and I know for sure that there are many differentially expressed genes between clusters. I'm not sure why the heatmap is not showing any patterns; it looks like a giant pale canvas. Do you think there is anything wrong in my code?

kstreet13 commented 3 years ago

It's a little hard to say without looking at the data and heatmap, but I have a guess: I think you might want to use logcounts (or some normalized version of the counts matrix) rather than the raw counts matrix. There's probably one incredibly large count that dominates the color scale and makes everything else look the same (this happens to me all the time).
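The effect of a single extreme count on a color scale can be illustrated with simulated data. This is a base R sketch under stated assumptions, not the poster's data: the matrix counts and the helper spread are hypothetical, and log1p() stands in for whatever normalized assay (for example a logcounts assay, if one is present) would be used in practice.

```r
set.seed(1)
# Simulated genes x cells count matrix with one extreme outlier.
counts <- matrix(rpois(50 * 20, lambda = 3), nrow = 50)
counts[1, 1] <- 5000

# Hypothetical summary: the 99th percentile as a fraction of the maximum,
# i.e. how much of the color range the bulk of the values actually uses.
spread <- function(m) unname(quantile(m, 0.99) / max(m))

spread(counts)         # close to 0: nearly every cell maps to the palest colors
spread(log1p(counts))  # far larger: the bulk of the values now spans a usable range
```

On the raw scale almost every value falls in the bottom sliver of the palette, which matches the "giant pale canvas" symptom; the log transform compresses the outlier so the remaining values separate visibly.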

EAC-T commented 3 years ago

Hi Kelly, I tried that, but it didn't work. I found this tutorial, of which you are an author: https://kstreet13.github.io/bioc2020trajectories/articles/workshopTrajectories.html#differential-progression-1 The heatmap looks good there. I'm just wondering how I would label the clusters on top of the heatmap. I'm almost there, I think. Thank you a lot

kstreet13 commented 3 years ago

We gave the code for that figure. Based on the pheatmap documentation, you can add additional tracks with annotation_row and annotation_col.
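A minimal sketch of a cluster track built with annotation_col follows, using simulated data. The objects heatdata and heatclus here are hypothetical stand-ins for the ones in the code earlier in the thread, and the pheatmap call is guarded so the snippet still runs when that package is not installed.

```r
set.seed(1)
# Simulated genes x cells matrix, columns already in pseudotime order.
heatdata <- matrix(rnorm(10 * 12), nrow = 10,
                   dimnames = list(paste0("gene", 1:10), paste0("cell", 1:12)))
# Simulated cluster label per cell, in the same column order.
heatclus <- factor(rep(c("0", "1", "2", "3"), each = 3))

# annotation_col expects a data frame whose row names match colnames(heatdata).
annotation <- data.frame(cluster = heatclus, row.names = colnames(heatdata))

if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(heatdata,
                     cluster_cols = FALSE,    # keep cells in pseudotime order
                     show_colnames = FALSE,
                     annotation_col = annotation)
}
```

The row names of the annotation data frame are what tie each label to a column, so a mismatch there is the first thing to check if the track does not appear as expected.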

EAC-T commented 3 years ago

OK, thank you a lot, Kelly. I will figure it out. Thank you for all your help so far; I greatly appreciate it.