kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Slingshot for flow and mass cytometry data #179

Closed france-hub closed 2 years ago

france-hub commented 2 years ago

Hello Kelly,

Thank you for sharing Slingshot! I was wondering if you have ever tried to use this package to infer trajectories for flow cytometry data. I have a SingleCellExperiment object in my analysis and I wanted to try to look at trajectories with Slingshot. Do you think Slingshot can be used?

Thank you very much Francesco

kstreet13 commented 2 years ago

Hi @france-hub,

Thanks, and that's a great question! The short answer is that I'm pretty sure it can be used; I've just never tried it.

From a theoretical standpoint, there's no reason why it shouldn't work for flow cytometry data. The basic inputs are (1) a dimensionality reduction and (2) a set of cluster labels, both of which can be produced from a wide variety of data types. Certainly anything that fits into a SingleCellExperiment would seem like a prime candidate for Slingshot (I'm currently working with CyTOF data and looking for a chance to apply it, just haven't found one yet).
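To make the two inputs concrete, here is a minimal sketch on simulated data. The names `rd` and `cl` and the simulated points are illustrative assumptions, not anything from the thread; the Slingshot call is guarded so that the data preparation still runs if the package is not installed.

```r
# Minimal sketch: Slingshot needs low-dimensional coordinates plus cluster labels.
# All data below are simulated; `rd` and `cl` are illustrative names.
set.seed(1)
n <- 150
u <- sort(runif(n))                        # hidden ordering along a single path
rd <- cbind(Dim1 = u + rnorm(n, sd = 0.05),
            Dim2 = u^2 + rnorm(n, sd = 0.05))
rownames(rd) <- paste0("cell", seq_len(n))

# Three clusters along the path, standing in for real cluster assignments
cl <- as.character(cut(u, breaks = c(0, 1/3, 2/3, 1),
                       labels = c("A", "B", "C"), include.lowest = TRUE))

if (requireNamespace("slingshot", quietly = TRUE)) {
  # (1) the reduced-dimension matrix, (2) the cluster labels
  pto <- slingshot::slingshot(rd, clusterLabels = cl, start.clus = "A")
  pt <- slingshot::slingPseudotime(pto)    # cells x lineages matrix
  print(head(pt))
}
```

Nothing in this sketch is specific to a particular assay, which is the point being made above: any data type that yields a coordinate matrix and a label vector can be passed in the same way.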

Like I said, I haven't tried it, but it should be fine and either way, I would be very interested to see how it works out for you! (If it works well, I might want to add a vignette on this application).

Best, Kelly

france-hub commented 2 years ago

Great! I'll keep you posted then!

Best, Francesco

france-hub commented 2 years ago

Hello Kelly,

Here is my attempt! I started from an sce object, which I built with the CATALYST package and its wonderful workflow. Then I did this:

rm(list = ls())

library(SingleCellExperiment)
library(slingshot)
library(CATALYST)
library(dplyr)
library(magrittr)
library(ggplot2)

#Load workspace
load("Spectral_step2_CD8.rds")

#Keep only cells with UMAP coordinates (using CATALYST function)
sce_sling <- filterSCE(sce, complete.cases(reducedDim(sce))) #remove NA

#Prepare clusterLabels
clusters <- sce_sling$cluster_id
levels(clusters) <- cluster_codes(sce_sling)$cluster_annotation

#Run slingshot
sce_sling <- slingshot(sce_sling, clusterLabels = clusters, 
                 start.clus = "C2", stretch = 0)

#Create dataframe of pseudotimes
pt <- setNames(as.data.frame(slingPseudotime(sce_sling, na = FALSE)), c("pt1", "pt2","pt3","pt4"))

#Run slingCurves
curve1 <- slingCurves(sce_sling)[[1]]
cv1 <- setNames(curve1$s[curve1$ord,] %>% as.data.frame(), c("UMAP_1", "UMAP_2"))

#Create dataframe with UMAP coordinates
umap_df <- setNames(as.data.frame(reducedDim(sce_sling)), c("UMAP_1", "UMAP_2"))

#Create dataframe with UMAP coordinates and pseudotimes
df <- cbind(umap_df, pt)

#Plot according to pseudotime values
p1 <- ggplot(df, aes(UMAP_1, UMAP_2)) +
  geom_point(aes(color = pt1),
             alpha = 0.5) +
  scale_colour_viridis_c() +
  theme_minimal() + labs(colour = "Pseudotime")

tiff("./UMAPpseudo.tiff", width = 5*900, height = 5*900, res = 300, pointsize = 5)     
p1
dev.off()

#Overlay the first lineage's curve (cv1 already supplies UMAP_1 and UMAP_2)

tiff("./sling.tiff", width = 5*400, height = 5*300, res = 300, pointsize = 5)     
p1 + geom_path(aes(x = UMAP_1, y = UMAP_2), data = cv1,
               col = "black", size = 1, arrow = arrow(), lineend = "round") 
dev.off()
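The code above draws only the first lineage's curve, although four pseudotime columns are extracted. As a hedged sketch, the remaining curves could be stacked into one data frame with a small helper. `curves_to_df` is a hypothetical name, and the helper assumes each element of the list has a matrix `s` and an index vector `ord`, as the `curve1$s[curve1$ord,]` line above implies. Mock curves are used so the sketch runs without slingshot.

```r
# Hypothetical helper: stack every lineage curve into one long data frame.
# Assumes each list element has `s` (points on the curve) and `ord` (their order).
curves_to_df <- function(curves, dim_names = c("UMAP_1", "UMAP_2")) {
  pieces <- lapply(seq_along(curves), function(i) {
    crv <- curves[[i]]
    out <- as.data.frame(crv$s[crv$ord, , drop = FALSE])
    names(out) <- dim_names
    out$lineage <- paste0("Lineage", i)
    out
  })
  do.call(rbind, pieces)
}

# Mock curves (not real Slingshot output) so the sketch is self-contained
mock_curve <- function(shift) {
  x <- c(3, 1, 2)
  list(s = cbind(x, x + shift), ord = order(x))
}
cv_all <- curves_to_df(list(mock_curve(0), mock_curve(10)))
```

With the real object, something like `curves_to_df(slingCurves(sce_sling))` followed by `p1 + geom_path(aes(UMAP_1, UMAP_2, group = lineage), data = cv_all)` would presumably draw all lineages at once.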

Please let me know what you think and whether this may work! It makes sense biologically.

[Image: UMAPpseudo, the UMAP plot colored by pseudotime]

Thanks Francesco

kstreet13 commented 2 years ago

Hi @france-hub,

I think this looks really good! I am familiar with CATALYST and I'm glad to see that it plays nicely with Slingshot.

My only question here is about the dimensionality reduction. On the UMAP plot, it definitely looks like there are a couple of major gaps. If you're confident that those are not categorical differences, but actually just areas of low density along the trajectory, then maybe a different dimensionality reduction could be better.

I will point out that Slingshot can work on any number of dimensions (not just two), so it would be feasible to try it out on, e.g., the top 5 PCs. You can still embed the curves in a different space (such as UMAP) for the purposes of visualization, and the underlying pseudotimes might be more accurate that way.
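A rough sketch of that suggestion, under stated assumptions: the expression matrix is simulated, the clusters come from `kmeans` as a placeholder, and the first two PCs stand in for UMAP coordinates. `embedCurves()` is the slingshot function for re-fitting curves in another space; whether it is available depends on the installed version, so the Slingshot calls are guarded.

```r
set.seed(42)
# Simulated stand-in for a cells x markers matrix (e.g. transformed intensities)
n_cells <- 200
n_markers <- 10
u <- sort(runif(n_cells))
expr <- sapply(seq_len(n_markers),
               function(j) sin(u * j) + rnorm(n_cells, sd = 0.1))
dimnames(expr) <- list(paste0("cell", seq_len(n_cells)),
                       paste0("marker", seq_len(n_markers)))

# Top 5 principal components as the Slingshot input space
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
pcs <- pca$x[, 1:5]

# Placeholders: in practice, use the existing cluster labels and UMAP coordinates
cl <- as.character(kmeans(pcs, centers = 3)$cluster)
emb2d <- pca$x[, 1:2]                      # stand-in for a UMAP embedding

if (requireNamespace("slingshot", quietly = TRUE)) {
  pto <- slingshot::slingshot(pcs, clusterLabels = cl)  # fit in 5 dimensions
  pt <- slingshot::slingPseudotime(pto)                 # pseudotimes from PC space
  emb <- slingshot::embedCurves(pto, emb2d)             # curves for 2D plotting only
}
```

The design choice here is to keep inference and visualization separate: pseudotimes are taken from the 5-dimensional fit, while the embedded curves are used only for drawing.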

Anyway, thanks very much for sharing! Kelly

france-hub commented 2 years ago

Thank you! I am going to close this and maybe reach out again in case I find issues. Francesco

janinemelsen commented 2 years ago

Hi!

I just wanted to comment that we published some code a while ago showing how Slingshot can be applied to flow cytometry data: https://pubmed.ncbi.nlm.nih.gov/32591399/

kstreet13 commented 2 years ago

Thanks for sharing, this looks great! Hope you don't mind if I add a link to your repo, for anyone who may come across this thread in the future looking for code: https://github.com/janinemelsen/Single-cell-analysis-flow-cytometry