kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

deviated lineage curve #166

Closed: pchiang5 closed this issue 2 years ago

pchiang5 commented 2 years ago

Hello, I generated the pseudotime and weight data with the code below:

sce <- slingshot(sce, clusterLabels = 'GMM', reducedDim = 'PCA', start.clus = '4', end.clus = c('0', '5', '2', '1'))

When I proceeded to get the expression curve of a gene (code below), there were high counts at the beginning of the pseudotime (middle panel) that were not there in the corresponding UMAP plot with raw counts (right panel).

sce <- fitGAM(sce, parallel = TRUE, verbose = TRUE, nknots = 7, BPPARAM = BiocParallel::bpparam())
plotSmoothers(sce, assays(sce)$counts, gene = GOI)

The tradeSeq team suggested that the abnormal counts could originate from the yellow cluster. Based on my experimental design and knowledge, I also believe lineage 2 should pass directly through the middle of the yellow cluster. Are there any parameters I could test to get a more reasonable trajectory? Thank you.

image

kstreet13 commented 2 years ago

Hi @pchiang5,

Sorry for not responding sooner! (I'm on the tradeSeq team and saw your issue there, but was busy with other things).

Anyway, I think Koen was correct that this is likely being caused by lineage 2 trying to go through the darker blue cluster (it's worth checking, though, in case it's something else). That's probably due to the shape of the light blue cluster, which makes it appear "closer" to the dark blue than the yellow. I think you might be able to avoid this by using a different distance metric to construct the initial MST, such as dist.method = 'simple' (for Euclidean distance) or dist.method = 'mnn' (for a robust distance based on mutual nearest neighbors).

Let me know if I'm mistaken about what is causing the issue or if changing the distance metric doesn't resolve it! Best, Kelly
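For readers who want to try the suggestion above, here is a minimal sketch of how the lineage structure can be compared across distance metrics. It uses the toy `slingshotExample` data bundled with slingshot, not the data from this issue, and the `methods` and `lineages` names are only illustrative. `'mnn'` can be added to the vector in the same way.

```r
library(slingshot)

# Toy data shipped with slingshot (NOT the data from this issue).
data("slingshotExample")
rd <- slingshotExample$rd  # 2-D reduced-dimension coordinates
cl <- slingshotExample$cl  # cluster labels

# Fit the trajectory once per distance metric and keep the inferred lineages.
methods <- c("slingshot", "simple")
lineages <- lapply(methods, function(m) {
  sds <- slingshot(rd, clusterLabels = cl, start.clus = "1", dist.method = m)
  slingLineages(sds)  # ordered cluster ids for each lineage
})
names(lineages) <- methods

# Inspect which clusters each lineage passes through under each metric.
print(lineages)
```

If the cluster order of a lineage changes between metrics, the MST is sensitive to how inter-cluster distance is measured, which is the situation described above.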

pchiang5 commented 2 years ago

Thank you @kstreet13.

Eventually, lineage 2 went through the yellow cluster when I used the 3D UMAP with dist.method = 'simple' as input to slingshot() (somehow dist.method = 'slingshot' produced a spurious extra branch on the map). The resulting expression curve looked reasonable with tradeSeq.

The high-dimensional PCA data did not work with either dist.method = 'simple' or dist.method = 'mnn'.
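A rough sketch of the combination that worked (3D UMAP plus dist.method = 'simple') might look like the following. The thread does not say how the 3D UMAP was computed, so the use of `scater::runUMAP`, the reducedDim name `'UMAP3D'`, and the helper name `slingshot_on_umap3d` are all assumptions; the cluster labels and start/end clusters are taken from the original call.

```r
library(slingshot)
library(scater)

# Hypothetical helper. 'GMM', 'PCA' and the cluster ids come from the thread;
# 'UMAP3D' and the use of scater::runUMAP are assumptions for illustration.
slingshot_on_umap3d <- function(sce) {
  # Compute a 3-component UMAP from the existing PCA coordinates.
  sce <- runUMAP(sce, dimred = "PCA", ncomponents = 3, name = "UMAP3D")
  # Build the cluster MST with Euclidean distances between cluster centers.
  slingshot(sce, clusterLabels = "GMM", reducedDim = "UMAP3D",
            start.clus = "4", end.clus = c("0", "5", "2", "1"),
            dist.method = "simple")
}
```

It would be called as `sce <- slingshot_on_umap3d(sce)`, after which the fitGAM() and plotSmoothers() calls from the first comment can be rerun unchanged.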