kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

how to filter cells before running slingshot? #150

Closed FADHLyemen closed 3 years ago

FADHLyemen commented 3 years ago

Hi, do you have any recommendations on how to filter cells before running Slingshot? Is Slingshot sensitive to the quality of the data?

kstreet13 commented 3 years ago

Hi @FADHLyemen,

That's a really good question! Regarding the second part: yes, Slingshot is definitely sensitive to the quality of the data, but some types of quality issues are more manageable than others. QC, filtering, normalization, dimensionality reduction, and clustering can all play a role in this. For example, if the clustering looks "messy" in the reduced-dimensional space, then changing the extend argument can make a big difference.

As for how to filter cells, this is also something of an open question. I think most people will follow guidelines similar to the Seurat tutorials and filter out barcodes above and below certain read count thresholds, but this is not necessarily the best approach. I think a more nuanced approach is called for and I would encourage you to check out the chapter on Quality Control in the book Orchestrating Single-Cell Analysis with Bioconductor.
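To make the contrast concrete, here is a minimal Python sketch of one adaptive alternative to fixed read-count cutoffs: flagging cells whose log-transformed total counts lie more than a few median absolute deviations (MADs) from the median. It is only an illustration of the general idea, not Slingshot's or Bioconductor's API; the function name `mad_outliers`, the `nmads=3` default, and the example counts are all assumptions made for this sketch.

```python
import math
from statistics import median


def mad_outliers(values, nmads=3.0, log=True):
    """Flag values lying more than `nmads` scaled MADs from the median.

    Hypothetical helper for illustration only. `values` must be
    positive when log=True.
    """
    x = [math.log10(v) for v in values] if log else list(values)
    med = median(x)
    # 1.4826 rescales the MAD so it is comparable to a standard
    # deviation when the data are roughly normal.
    mad = 1.4826 * median([abs(v - med) for v in x])
    lower, upper = med - nmads * mad, med + nmads * mad
    return [v < lower or v > upper for v in x]


# Made-up total read counts per cell (not real data).
total_counts = [5200, 4800, 5100, 4950, 5300, 150, 5050, 60000]
flags = mad_outliers(total_counts)
kept = [c for c, is_out in zip(total_counts, flags) if not is_out]
```

Because the bounds are derived from the data themselves, they adapt to each dataset's sequencing depth, whereas a hard-coded cutoff has to be re-tuned by hand. Whether a flagged cell should actually be dropped remains a judgment call.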

Best, Kelly

FADHLyemen commented 3 years ago

@kstreet13

I have some cells that do not cluster with the majority of the cells: [image]

I would like to remove them because they look odd: [image]

They probably also prevent an accurate fit: [image]

Is their high imbalance score a good justification for removing them?

[image]

Any ideas?

FADHLyemen commented 3 years ago

@kstreet13 Could I delete cells with a high imbalance score? Thank you

kstreet13 commented 3 years ago

Unfortunately, I don't know enough about your dataset to make a strong recommendation. A high imbalance score can indicate that it is inappropriate to fit a single, common trajectory, but it is not meant to be an outlier detection method. Slingshot will always fit a trajectory, even when it may not be appropriate, but deciding which cells to include in your analysis is up to you. If the outlier cells cluster separately from the main group, you may be able to make a biological argument for why they should be unrelated (or at least not connected by a trajectory). I have done this sort of analysis before, where we fit a trajectory to a subset of interest, such as a particular type of T cells.
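As a rough sketch of the subsetting step described above, the snippet below keeps only the cells belonging to clusters of interest, so that the reduced-dimensional coordinates and cluster labels passed on to the trajectory fit stay aligned. It is written in plain Python for illustration; `subset_by_cluster`, the cluster names, and the coordinates are hypothetical and not part of Slingshot.

```python
def subset_by_cluster(embedding, cluster_labels, keep):
    """Return the embedding rows and labels for cells in the `keep` clusters.

    Hypothetical helper: `embedding` is a list of per-cell coordinates and
    `cluster_labels` is a parallel list with one label per cell.
    """
    if len(embedding) != len(cluster_labels):
        raise ValueError("embedding and cluster_labels must have equal length")
    keep = set(keep)
    idx = [i for i, label in enumerate(cluster_labels) if label in keep]
    return [embedding[i] for i in idx], [cluster_labels[i] for i in idx]


# Made-up 2D coordinates and cluster labels (not real data).
embedding = [(0.1, 0.2), (0.3, 0.1), (5.0, 5.2), (0.4, 0.5), (5.1, 4.9)]
clusters = ["T_naive", "T_naive", "outlier", "T_effector", "outlier"]

sub_embedding, sub_clusters = subset_by_cluster(
    embedding, clusters, keep=["T_naive", "T_effector"]
)
```

The subset would then be what gets handed to the trajectory fit. The excluded cluster still needs a biological justification; the code only makes the exclusion explicit and reproducible.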