kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Slingshot trajectory depends on number of clusters #168

Closed by aaz398 2 years ago

aaz398 commented 2 years ago

Hi Kelly,

I performed clustering on my single-cell dataset and obtained six clusters (clusters 0-5). When I started working with slingshot, I used a subset of my dataset (three clusters only: clusters 3-5, which are located next to each other and which I would expect at the end of the trajectory), obtained a slingshot curve, and performed tradeSeq analysis. Now I am working with the entire dataset (six clusters), and I am noticing that the slingshot curve is different. Specifically, the emphasis on cluster 3 in the pseudotime trajectory is reduced when looking at the entire dataset. Is this expected?

Thank you.

kstreet13 commented 2 years ago

Hi @aaz398,

I can't say anything specific about your data, but in general, yes, I would expect Slingshot to give different outputs on the full data vs. a subset. That said, if there is a noticeable separation between groups of clusters and you think they shouldn't be included in a common trajectory, there are some parameters you can adjust that could allow Slingshot to detect such differences.

I am thinking specifically of the dist.method and omega parameters. By default, Slingshot will construct a minimum spanning tree (MST) that connects all clusters, but setting omega = TRUE (or a positive value) opens up the possibility of some clusters being separate from others. This separation is based on the distances between clusters, which aren't always robust, so it may also be beneficial to set dist.method = 'mnn', which uses mutual nearest neighbors to determine cluster distances.

If there is sufficient separation between your groups of clusters, Slingshot may be able to pick that up and construct distinct trajectories for each group.
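To illustrate the idea of separating clusters by distance, here is a minimal conceptual sketch in Python. It is not Slingshot's implementation and does not use its API. It builds an MST over cluster centers, drops any edge longer than a threshold, and reports the resulting groups. The cluster coordinates, the function names, and the max_edge threshold are all assumptions made up for illustration; how Slingshot derives its own threshold from omega, and how dist.method = 'mnn' measures distance, may differ.

```python
import math

def mst_edges(centers):
    """Prim's algorithm over cluster centers; returns (length, a, b) edges."""
    names = list(centers)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Shortest edge from a cluster already in the tree to one outside it.
        best = min(
            (math.dist(centers[a], centers[b]), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges

def split_by_threshold(centers, max_edge):
    """Drop MST edges longer than max_edge and return the connected groups."""
    parent = {name: name for name in centers}

    def find(x):
        # Union-find root lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for length, a, b in mst_edges(centers):
        if length <= max_edge:
            parent[find(a)] = find(b)

    groups = {}
    for name in centers:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())

# Made-up 2-D cluster centers: clusters 0-2 and 3-5 form two distant groups.
centers = {
    "0": (0.0, 0.0), "1": (1.0, 0.0), "2": (2.0, 0.5),
    "3": (10.0, 0.0), "4": (11.0, 0.5), "5": (12.0, 0.0),
}

# No limit on edge length: every cluster ends up in one tree.
groups_connected = split_by_threshold(centers, max_edge=math.inf)

# Hypothetical limit of 3.0: the long edge between the two groups is cut.
groups_split = split_by_threshold(centers, max_edge=3.0)

print(groups_connected)
print(groups_split)
```

With no limit, the single long edge between the two groups forces them into one tree, which is why adding clusters 0-2 can change the trajectory through clusters 3-5. With a limit, the two groups are treated separately.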

Hope this helps!
Kelly