nolanlab / spade

SPADE: Spanning Tree Progression of Density Normalized Events

Spade clusters number #132

Closed joanqcflow closed 7 years ago

joanqcflow commented 7 years ago

Hello!

I am a new user of SPADE and I'm processing some classical flow cytometry data (13 parameters) with it. I am able to generate trees, but the choice of the number of clusters still confuses me. What is the right way to choose this number? Is there any risk in under- or overestimating it?

Thank you for your help and answers.

Joan

zbjornson commented 7 years ago

Hi Joan -

It's a bit of trial and error and compromise. A lot of people want exactly one cell type per cluster, which is basically impossible to achieve. With too few clusters, each cluster contains a mix of different cell types, which is decidedly bad. With too many clusters, the tree is harder to look at, but the clusters are more likely to be "pure." So err on the side of too many clusters. The best number is specific to your dataset (e.g. a cell line versus depleted/enriched blood versus whole blood). The default in most SPADE applications is 200 clusters, which does pretty well for many datasets, but 50 to 150 may work better. Try a few...
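
To make the "try a few" advice concrete, here is a minimal Python sketch of sweeping the cluster count and checking how "pure" the clusters come out. It is only an illustration under loud assumptions: the data are synthetic with made-up cell-type labels, the tiny 1-D k-means is a stand-in and not SPADE's own clustering, and `purity`, `kmeans_1d`, and `sweep` are hypothetical helper names. On real data you have no ground-truth labels, so you would judge purity by looking at marker expression per node instead.

```python
import random
from collections import Counter


def purity(cluster_ids, cell_types):
    """Fraction of events that belong to the majority cell type of their cluster."""
    by_cluster = {}
    for cid, ctype in zip(cluster_ids, cell_types):
        by_cluster.setdefault(cid, Counter())[ctype] += 1
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(cluster_ids)


def kmeans_1d(values, k, rng, iters=20):
    """Tiny 1-D k-means. ASSUMPTION: a stand-in only, not SPADE's algorithm."""
    centers = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each event to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Move each center to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def sweep(values, cell_types, ks, seed=0):
    """Cluster once per candidate k and report the purity for each."""
    rng = random.Random(seed)
    return {k: purity(kmeans_1d(values, k, rng), cell_types) for k in ks}


# Synthetic single-marker data: four made-up populations, 100 events each.
data_rng = random.Random(42)
true_means = {"T cell": 0.0, "B cell": 3.0, "monocyte": 6.0, "NK cell": 9.0}
values, cell_types = [], []
for name, mu in true_means.items():
    for _ in range(100):
        values.append(data_rng.gauss(mu, 0.7))
        cell_types.append(name)

for k, p in sweep(values, cell_types, ks=[2, 4, 8, 16]).items():
    print(f"k={k:>2}  purity={p:.2f}")
```

With four equally sized populations, two clusters cannot exceed a purity of 0.5, because each cluster has to absorb at least two cell types. That is the "too few clusters" failure mode in miniature. Larger values of k can only be compared by running the sweep, which is the point of trying a few.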

Hope that helps.

joanqcflow commented 7 years ago

Hi Zach!

Thanks a lot! That really clarifies the issue. I'll try different combinations.

Cheers