kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Minimum spanning tree - always on two dimension reduction? #160

Closed: lucygarner closed this issue 2 years ago

lucygarner commented 2 years ago

Hi,

I noticed in your vignette that you build the cluster minimum spanning tree (MST) on the first two principal components. Would it make sense to generate the MST on a greater number of PC dimensions, and have you tried this? Would using more than two dimensions make it difficult to visualise the downstream trajectories?

Best wishes, Lucy

kstreet13 commented 2 years ago

Hi @lucygarner,

Great question! Briefly, yes: it would definitely make sense to use more PCs (Slingshot is not limited to two). It can make visualization a little harder, but not impossible.

We have done this in some of our published analyses, using 5-20 PCs for Slingshot and then visualizing on the top 3 PCs or on some other embedding (tSNE/UMAP). If you're using PCs for visualization, you shouldn't need to adjust the curves; you can just plot the first 2 or 3 dimensions. If you want to use tSNE/UMAP/etc., you can try the embedCurves function (discussed here). It is designed for this sort of scenario: it attempts to translate the curves from one embedding to another. This doesn't always work right away, so you may want to play with the spar argument from smooth.spline to adjust the flexibility of the curves when using a nonlinear dimensionality reduction.

Best, Kelly
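
For readers who want to see why the number of dimensions can matter, here is a minimal sketch in plain Python (not Slingshot's R code, and not its implementation). It builds an MST over cluster centroids using however many coordinates it is given. The toy cells, the helper names (cluster_centroids, minimum_spanning_tree, mst_on_first_k) and the use of plain Euclidean distance between centroids are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cluster_centroids(points, labels):
    """Mean coordinate of each cluster, in as many dimensions as the points have."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def minimum_spanning_tree(centroids):
    """Prim's algorithm on the complete graph of centroids.

    Assumption: edge weight is the Euclidean distance between centroids.
    """
    names = sorted(centroids)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge that connects the current tree to a new cluster.
        _, a, b = min(
            (math.dist(centroids[a], centroids[b]), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        edges.append((a, b))
        in_tree.add(b)
    return edges


def mst_on_first_k(points, labels, k):
    """Keep only the first k coordinates (e.g. the top k PCs), then build the MST."""
    truncated = [p[:k] for p in points]
    return minimum_spanning_tree(cluster_centroids(truncated, labels))


# Made-up cells in a 3-dimensional "PC" space; cluster B is separated
# from A and C only along the third coordinate.
cells = [
    [0.0, 0.0, 0.0], [0.2, 0.0, 0.0],   # cluster A
    [0.9, 0.0, 4.0], [1.1, 0.0, 4.0],   # cluster B
    [2.0, 0.0, 0.0], [2.2, 0.0, 0.0],   # cluster C
]
labels = ["A", "A", "B", "B", "C", "C"]

print(mst_on_first_k(cells, labels, 2))  # [('A', 'B'), ('B', 'C')]
print(mst_on_first_k(cells, labels, 3))  # [('A', 'C'), ('A', 'B')]
```

In this toy data the tree built on two coordinates places B between A and C, while the tree built on three coordinates links A and C directly, because B only separates along the third axis. Real data may or may not behave this way; the point is only that nothing in the MST step itself depends on having exactly two dimensions.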