kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Are slingshot pseudotime values comparable across diverging lineages within a trajectory? #169

Closed: Sophia409 closed this issue 2 years ago

Sophia409 commented 2 years ago

Hi all,

My data contains 3 lineages that share a common initial state but branch and terminate at different clusters (2, 4, 5). Among these branching lineages, lineage 3 ends at a pseudotime value of 30, but lineages 1 and 2 reach a pseudotime value of 40. Why does lineage 3 end so early? How should I interpret the different pseudotime lengths across lineages? Does it mean that cells in lineage 3 reach their terminal state earlier and then become static, while the other lineages are still differentiating? That seems odd.

[image]

When I make a 3D plot of this dataset, the trajectories are more clearly separated, and the path lengths of the 3 lineages look similar. Why is the pseudotime length that slingshot calculates for lineage 3 shorter than the others? [image]

After checking the real time identities of these clusters, I found that cluster 2 in lineage 3 contains more cells from early stages, while clusters 4 and 5 in the other lineages contain more cells from late stages. Even so, cluster 2 includes cells from all stages, which means that cells in cluster 2 still exist at late stages. This seems inconsistent with the slingshot result that they disappear at later times.

[image]

Could you help me figure this out? It would be greatly appreciated!

Best regards and many thanks! Sophia

kstreet13 commented 2 years ago

Hi @Sophia409,

These are good questions, so thanks for submitting them! In general, I think these results look pretty good and there's nothing to be too worried about.

One important thing to note is that Slingshot doesn't use any "real time" information, only transcriptional information. So the trajectories it produces are based on gene expression differences, not actual time (we have occasionally emphasized this point by using phrases such as "transcriptional distance" in place of "pseudotime"). This means that the length of a lineage is more related to the amount of change in gene expression than the actual time required to make those changes.

With that said, I think it makes sense that your Lineage 3 is shorter than the others, considering that it seems to contain only 3 clusters, as opposed to 4. But again, that difference in length is not indicative of the speed with which cells transition along each lineage. Similarly, the fact that that lineage's pseudotime values stop earlier does not mean those cells "disappear." Rather, they are just done altering their transcriptional profile.
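To make the "transcriptional distance" idea concrete, here is a minimal Python sketch. It is not Slingshot's actual algorithm (Slingshot fits smooth curves and projects individual cells onto them); it only measures cumulative distance along a straight-line path through cluster centers. The function name `path_pseudotime` and all coordinates are made up for illustration.

```python
import numpy as np

def path_pseudotime(centers):
    """Cumulative Euclidean distance along a path of cluster centers.

    A simplified stand-in for pseudotime: position along the path is
    measured purely by distance travelled in expression space.
    """
    centers = np.asarray(centers, dtype=float)
    # Length of each segment between consecutive centers.
    steps = np.linalg.norm(np.diff(centers, axis=0), axis=1)
    # Pseudotime at each center: 0 at the start, then the running total.
    return np.concatenate([[0.0], np.cumsum(steps)])

# Hypothetical 2-D cluster centers (not taken from the data in this issue).
# Both lineages share the first two centers and then diverge.
four_cluster_lineage = [(0, 0), (10, 0), (20, 0), (30, 0)]
three_cluster_lineage = [(0, 0), (10, 0), (10, 10)]

long_pt = path_pseudotime(four_cluster_lineage)
short_pt = path_pseudotime(three_cluster_lineage)
print(long_pt[-1], short_pt[-1])
```

In this toy setup the three-cluster path ends at a smaller value than the four-cluster path simply because it covers less distance in expression space. Nothing in the calculation refers to elapsed time.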

The real time data you showed in the last figure is the only indication we have of the actual timings of these transitions. It is interesting that cluster 2 shows a lot of cells from the earliest timepoint and (if the trajectory is correct) this might actually indicate that lineage 3 is moving quickly. It does seem strange, though, that the starting cluster (cluster 1) has more cells from later timepoints. I'm honestly not sure what to make of that, but it probably requires someone with more domain knowledge than me.
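One way to look at this kind of real time composition is to tabulate, for each cluster, the fraction of its cells that come from each timepoint. The sketch below does that with the Python standard library only. The helper name `timepoint_fractions` and the labels are hypothetical, not taken from the dataset in this issue.

```python
from collections import Counter, defaultdict

def timepoint_fractions(clusters, timepoints):
    """Return {cluster: {timepoint: fraction of that cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, timepoint in zip(clusters, timepoints):
        counts[cluster][timepoint] += 1
    return {
        cluster: {t: n / sum(ctr.values()) for t, n in ctr.items()}
        for cluster, ctr in counts.items()
    }

# Hypothetical per-cell labels: one cluster ID and one sampling time per cell.
clusters = [1, 1, 1, 1, 2, 2, 2, 2]
timepoints = ["early", "late", "late", "late",
              "early", "early", "early", "late"]

fractions = timepoint_fractions(clusters, timepoints)
for cluster, fracs in sorted(fractions.items()):
    print(cluster, fracs)
```

A starting cluster dominated by late timepoints, as in this toy example for cluster 1, is the kind of pattern that may be worth checking against known marker genes before trusting the chosen root.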

Hope this helps! Kelly

Sophia409 commented 2 years ago

Hi Kelly (@kstreet13), thank you very much. Your answer is very helpful.

It's a good thing you reminded me to check whether the trajectory is correct. After checking the expression of these clusters, I found that cluster 0 is the real starting cluster, not cluster 1. So I reran the slingshot analysis.