kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Slingshot curves look weird #195

Closed madhu-coder closed 1 year ago

madhu-coder commented 2 years ago

Hi, I am using slingshot. More precisely, I am not using the original script but a version modified by the authors of this paper (https://www.pnas.org/doi/10.1073/pnas.1817715116#supplementary-materials) so that it accepts soft cluster assignments (similar to the fuzzy c-means concept), a membership matrix (a cell-by-cluster matrix giving the probability that each developmental cell belongs to each cluster), and the cluster centers. The modified code is available here:
Computing lineages, pseudotime, descendants, and curves: https://rdrr.io/github/lingxuez/SOUP/src/R/SOUPlineage.R
Plotting the lineages and curves: https://rdrr.io/github/lingxuez/SOUP/src/R/utils_plot_lineage.R

> obj
class: SlingshotDataSet 

 Samples Dimensions
     433          2

lineages: 2 
Lineage1: 4  5  1  2  3  
Lineage2: 4  5  1  2  6  

curves: 2 
Curve1: Length: 81.06   Samples: 394.7
Curve2: Length: 75.806  Samples: 416.61

This is the SlingshotDataSet object for my data, which I created with the newSlingshotDataSet function from slingshot. I used PCA for dimensionality reduction.

lineag_center <- getClusterLineages(centers, end.clust = NULL, start.clust = NULL, dist.fun = NULL)

Pseudotime_lineage_center_1 <- getLineageTime(lineag_center[[1]], membership_mat)

curves_center <- getLineageCurves(log.select.expr_hund_cell_types,
  lineages = lineag_center[[1]], centers = centers, membership = membership_mat,
  shrink = TRUE, smoother = "smooth.spline", extend = "y", drop.multi = TRUE,
  reweight = TRUE, shrink.method = "cosine", stretch = 2, thresh = 0.001, maxit = 15)

The code above creates the lineages, pseudotime, and curves. Since plotting requires a SlingshotDataSet object, I created one (obj, shown above) with newSlingshotDataSet. For plotting, I used this command:

plotLineageCurves_edited(obj, centers = NULL, dims = c(1:2), add = FALSE,
  labels = NULL, lwd = 2, col = 6, lab.cex = 2, cex = 2,
  pos = 2, offset = 0.5, type = "both")

("edited" because I changed the original function's contents, i.e., I replaced slingAdjacency with slingMST and clusterLabels with slingClusterLabels.)

The resulting plots looked like this: lineage_sling_SOUP_lineages_plt.pdf, lineage_sling_SOUP_curves_plt.pdf, lineage_sling_SOUP_both_plt.pdf. Strangely, they contain no data points.

For the following code:

colors <- colorRampPalette(brewer.pal(11,'Spectral')[-6])(100)
plotcol <- colors[cut(sling_pseudotime_L2, breaks=100)]
plot(rd1, col = plotcol, pch=16, asp = 1)
lines(obj_copy, lwd=2, col='black')

The plot looked like this: SOUP_curve_slinshot_plt.pdf

And for the following code:

colors <- colorRampPalette(brewer.pal(11,'Spectral')[-6])(100)
plotcol <- colors[cut(sling_pseudotime_L2, breaks=100)]
plot(rd1, col = plotcol, pch=16, asp = 1)
lines(obj_copy,type ='lineages', lwd=2, col='black')

The plot was: SOUP_lineage_slinshot_plt.pdf

UMAP: umap.pdf PCA: pca.pdf

Also, curve1$s is a matrix with 433 rows (cells) and 257 columns (genes), which matches my original expression matrix; the same is true for curve2. Is this unusual? In the slingshot example dataset, $s contains all the cells but only 2 columns (dimensions).

> dim(obj_copy@curves$curve1$s)
[1] 433 257

Please help. Thanks!

kstreet13 commented 2 years ago

Hi @madhu-coder,

I'm not sure how much I'll be able to help, since it seems like you are adapting code that was adapted from mine, but I might be able to point you in the right direction.

First of all, the slingshot package can handle soft clustering. The SOUP authors may have been working from an older version (it looks like their repo hasn't been updated in a few years), but this is something that has been supported for a while.

Also, the matrices of points along the curve (i.e., curve$s) should have the same dimensionality as your reduced-dimensional space. So if you are using the top 3 PCs, then curve$s should have only 3 dimensions, not 257. Again, I'm not really sure what's going on here, because all of this code is from a separate package, but that seems like a pretty big issue.
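To illustrate the dimensionality point, here is a minimal base-R sketch on simulated data. The matrix `expr` and the helper `same_space` are hypothetical stand-ins, not part of slingshot or SOUP; the sketch only shows that a 433 x 257 matrix does not match a 433 x 2 PCA embedding.

```r
# Simulated stand-in for an expression matrix: 433 cells x 257 genes.
set.seed(1)
expr <- matrix(rnorm(433 * 257), nrow = 433, ncol = 257)

# Reduce to the top 2 principal components (cells stay in rows).
pca <- prcomp(expr, center = TRUE, scale. = FALSE)
rd <- pca$x[, 1:2]

# Hypothetical helper: TRUE when the curve points have the same
# shape (cells x dimensions) as the reduced-dimensional embedding.
same_space <- function(curve_s, rd) {
  identical(dim(curve_s), dim(rd))
}

dim(rd)                 # 433 2
same_space(expr, rd)    # FALSE: 257 columns means gene space, not PC space
same_space(rd, rd)      # TRUE: what a curve fitted on `rd` should look like
```

If a check like this fails on a fitted curve, one plausible cause is that the full expression matrix, rather than the PCA coordinates, was passed to the curve-fitting step.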

I would encourage you to re-try your analysis with the functions provided by slingshot, using your membership matrix for the clusterLabels argument. I think it may simplify your workflow and it will definitely make it easier for me to diagnose any problems.
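As a rough sketch of that workflow, the example below builds a toy 2-D embedding and a soft membership matrix in base R, then passes the matrix as `clusterLabels`. All data and names here (`centers_toy`, the Gaussian-style weighting) are invented for illustration. The slingshot call is guarded so the sketch still runs if the package is not installed, and its exact behavior may differ between slingshot versions.

```r
set.seed(42)
n_cells <- 300
n_clusters <- 3

# Toy 2-D embedding (stand-in for the top 2 PCs): three blobs along a path.
centers_toy <- rbind(c(0, 0), c(4, 1), c(8, 0))
hard <- sample(seq_len(n_clusters), n_cells, replace = TRUE)
rd <- centers_toy[hard, ] + matrix(rnorm(n_cells * 2, sd = 0.8), ncol = 2)
colnames(rd) <- c("PC1", "PC2")
rownames(rd) <- paste0("cell", seq_len(n_cells))

# Soft membership (assumed weighting scheme): weights decay with squared
# distance to each toy center, and each row is normalized to sum to 1.
d2 <- sapply(seq_len(n_clusters), function(k) {
  rowSums((rd - matrix(centers_toy[k, ], n_cells, 2, byrow = TRUE))^2)
})
membership_mat <- exp(-d2 / 2)
membership_mat <- membership_mat / rowSums(membership_mat)
colnames(membership_mat) <- paste0("cluster", seq_len(n_clusters))
rownames(membership_mat) <- rownames(rd)

# Pass the cell-by-cluster weight matrix directly as clusterLabels.
if (requireNamespace("slingshot", quietly = TRUE)) {
  sds <- slingshot::slingshot(rd, clusterLabels = membership_mat)
  pt <- slingshot::slingPseudotime(sds)  # cells x lineages pseudotime matrix
  print(dim(pt))
}
```

The key point is that `rd` holds only the reduced dimensions, so any fitted curves live in that same low-dimensional space.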

Best, Kelly

madhu-coder commented 2 years ago

Hey @kstreet13! Thank you so much for your guidance! I have decided to abandon that script altogether, since even the paper's authors seem to have forgotten about it and, I'm afraid, are of no help. Thankfully, I found help here: I used slingshot with the membership matrix as you suggested, and I am getting results. I still need to tweak the number of principal components to refine the result. I will come back if I run into any errors.

Thanks once again!