kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

confusion about the end of a lineage #252

Open Dooozy opened 1 week ago

Dooozy commented 1 week ago

Hi, thank you for producing and maintaining this package; it has given me good insight into lineage trends. I set an initial cluster, but the endpoints of the lineages somehow fall outside the final clusters.

Here is my object colored by Seurat cluster: plot1

I want cluster 2 to be the root, so I ran the code below:

sim <- slingshot(sim,
                 clusterLabels = 'celltype',
                 reducedDim = 'UMAP',
                 start.clus = "2",
                 end.clus = NULL
)

It did produce lineages that start in cluster 2, but the curves are confusing:

> SlingshotDataSet(sim) 
class: SlingshotDataSet 

 Samples Dimensions
   37747          2

lineages: 3 
Lineage1: 2  3  1  0  5  
Lineage2: 2  3  1  7  
Lineage3: 2  4  6  

curves: 3 
Curve1: Length: 19.701  Samples: 21594.93
Curve2: Length: 18.412  Samples: 17134.35
Curve3: Length: 23.042  Samples: 21914.49

plot5

Then I found that my question may be related to some optional arguments. Following #118, I ran the code below:

sim <- slingshot(sim,
                 clusterLabels = 'celltype',
                 reducedDim = 'UMAP',
                 start.clus = "2",
                 end.clus = NULL,
                 extend = 'n',
                 stretch = 0
)

The resulting SlingshotDataSet(sim) is:

Samples Dimensions
   37747          2

lineages: 3 
Lineage1: 2  3  1  0  5  
Lineage2: 2  3  1  7  
Lineage3: 2  4  6  

curves: 3 
Curve1: Length: 13.48   Samples: 24681.55
Curve2: Length: 9.8714  Samples: 16468.86
Curve3: Length: 12.416  Samples: 13229.51 

plot4 This looks better, but the curves still do not end where I expected.

Take lineage 3, for example. I expected that with stretch = 0 the curve would end at the center of cluster 6. When I run:

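library(RColorBrewer)  # provides brewer.pal()

# Color cells by lineage 3 pseudotime; cells with no pseudotime (NA) are grey
colors <- colorRampPalette(brewer.pal(11, 'Spectral')[-6])(100)
plotcol <- colors[cut(sim$slingPseudotime_3, breaks = 100)]
plotcol[is.na(plotcol)] <- "lightgrey"
plot(reducedDims(sim)$UMAP, col = plotcol, pch = 16, asp = 1)

# Overlay the fitted curves; the legend reuses the same Set1 palette
lineage_cols <- brewer.pal(9, "Set1")
lines(SlingshotDataSet(sim), lwd = 2, col = lineage_cols)
legend("right",
       legend = paste0("lineage", 1:3),
       col = lineage_cols[1:3],
       inset = 0.8,
       pch = 16)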

plot3 Many of the blue cells (on the left) do not seem to belong to cluster 2, 4, or 6. As this plot shows, only a few cells from clusters 2, 4, and 6 lie to the left of UMAP_1 = 0: plot2
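
The same check can also be made numerically. The following is only a minimal sketch in base R: `count_stray_cells` is a hypothetical helper (not part of slingshot), and the toy matrix and labels are made up to stand in for `reducedDims(sim)$UMAP` and `sim$celltype`.

```r
# Hypothetical helper (not a slingshot function): for each cluster, count how
# many cells fall below `threshold` on one axis of an embedding.
count_stray_cells <- function(embedding, clusters, axis = 1, threshold = 0) {
  stopifnot(nrow(embedding) == length(clusters))
  stray <- factor(embedding[, axis] < threshold, levels = c(FALSE, TRUE))
  table(cluster = as.character(clusters), stray = stray)
}

# Toy stand-ins for the real embedding and cluster labels (made-up values)
toy_umap <- cbind(UMAP_1 = c(-3, 1, 2, -1, 4, 5),
                  UMAP_2 = c(0, 1, 2, 3, 4, 5))
toy_clusters <- c("6", "6", "4", "2", "2", "4")

stray_table <- count_stray_cells(toy_umap, toy_clusters)
print(stray_table)
```

On the real object, the call would presumably be `count_stray_cells(reducedDims(sim)$UMAP, sim$celltype)`, assuming the first UMAP column is the axis shown as UMAP_1 in the plots.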

How can I make the trajectory for lineage 3 end at cluster 6? Could you please explain why this is happening? Thank you!

kstreet13 commented 1 week ago

Hi @Dooozy,

I definitely think you had the right idea setting extend = 'n' and stretch = 0, and I'm also surprised that some of those cells on the far left are included in lineage 3. In general, however, I would advise against running Slingshot on a UMAP dimensionality reduction, as UMAP embeddings are not easily reproducible. You could try PCA or other methods (I should note that Slingshot is not limited to 2 dimensions, so you can run it on, say, the top 10 PCs). Also, Louvain clustering can cause issues such as this, due to its imprecise nature. There aren't many, but there are still some cells from cluster 6 on the far left, and that may be the problem.
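
A minimal sketch of the PCA suggestion, under some assumptions: `run_slingshot_on_pcs` is a hypothetical wrapper (not part of slingshot), and it assumes the object already stores a PCA embedding under the name "PCA" with at least 10 columns. The `slingshot()` arguments are the ones used earlier in this thread.

```r
library(SingleCellExperiment)
library(slingshot)

# Hypothetical wrapper (not a slingshot function): copy the first `n_pcs`
# columns of an existing PCA embedding into a new reducedDim slot and fit
# Slingshot on that slot instead of on UMAP.
run_slingshot_on_pcs <- function(sce, n_pcs = 10, pca_name = "PCA", ...) {
  pcs <- reducedDim(sce, pca_name)
  stopifnot(ncol(pcs) >= n_pcs)
  new_name <- paste0(pca_name, n_pcs)
  reducedDim(sce, new_name) <- pcs[, seq_len(n_pcs), drop = FALSE]
  # Remaining arguments (clusterLabels, start.clus, extend, ...) pass through
  slingshot(sce, reducedDim = new_name, ...)
}

# Possible call on the object from this thread (assumes a "PCA" slot exists):
# sim <- run_slingshot_on_pcs(sim, n_pcs = 10, clusterLabels = "celltype",
#                             start.clus = "2", extend = "n", stretch = 0)
```

The curves are then fitted in 10 dimensions, so they cannot be drawn directly on the UMAP plot; the pseudotime values can still be used to color cells in any embedding.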

Otherwise, just based on the shape of your data, I don't think there's any way you could find 3 lineages that make sense. It looks like a single arc of points, so I think 2 lineages (with a starting point in the middle) is the most that could make sense. This is also related to the clustering, and may be an indicator of over-clustering (more clusters lead to more spurious branching events).
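
One way to test the over-clustering idea is to collapse some clusters into coarser groups before rerunning Slingshot. The sketch below is base R only; `merge_clusters` is a hypothetical helper, and the grouping shown is arbitrary and purely illustrative. Which clusters to merge would have to come from the data and the biology.

```r
# Hypothetical helper (not a slingshot function): relabel cells so that every
# cluster listed under a group name receives that name; clusters not listed
# in any group keep their original label.
merge_clusters <- function(labels, groups) {
  labels <- as.character(labels)
  merged <- labels
  for (new_label in names(groups)) {
    merged[labels %in% groups[[new_label]]] <- new_label
  }
  merged
}

# Toy labels and an arbitrary, illustrative grouping (an assumption, not a
# recommendation for this dataset)
toy_labels <- c("0", "1", "3", "2", "4", "6", "5", "7")
toy_groups <- list(A = c("0", "5", "7"), B = c("1", "3"), C = c("4", "6"))

merged_labels <- merge_clusters(toy_labels, toy_groups)
print(merged_labels)
```

The merged labels could then be stored as a new column, for example `sim$celltype_coarse <- merge_clusters(sim$celltype, toy_groups)` (the column name is hypothetical), and passed to `slingshot()` through `clusterLabels`.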

Hope this helps! Kelly