lazappi / clustree

Visualise Clusterings at Different Resolutions
https://lazappi.github.io/clustree/
GNU General Public License v3.0

Filtering nodes in the clustering graph #77

Closed: repeatpipettor closed this issue 2 years ago

repeatpipettor commented 2 years ago

I am using clustree to create clustering trees for my scRNA-seq data. For display purposes, I would like to filter out the smallest nodes (clusters), those that contain fewer than a certain number of cells. Is this possible? I tried the existing edge filters (count_filter and prop_filter), but these did not remove the small clusters. Thank you!

lazappi commented 2 years ago

Hi @repeatpipettor

Thanks for giving {clustree} a go! You are correct, the filters above are for edges rather than nodes. This is because the clustering tree algorithm is designed to show relationships between clusterings, and it isn't really clear how to handle a sample that is assigned to a cluster in one clustering but not in another (because the cluster it was assigned to has been filtered out).

An alternative might be to bundle all your small clusters into one label. That would let you still include them in the tree while avoiding too much clutter. To do this you would need to modify your data before passing it to the clustree() function. If the small clusters only occur at higher resolutions, a simpler approach would be to remove those resolutions and show only the lower ones.
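As a rough illustration of both suggestions, here is a minimal base R sketch. It assumes the clusterings are stored as columns of a data frame that share a common prefix. The helper bundle_small_clusters(), the "res." prefix, the "small" label, the threshold of 10 cells and the toy data are all made up for this example and are not part of clustree.

```r
# Hypothetical helper, not part of clustree: within each clustering column,
# relabel every cluster with fewer than `min_cells` cells as `label`.
bundle_small_clusters <- function(data, prefix, min_cells = 10, label = "small") {
  # Clustering columns are the ones whose names start with the prefix
  cluster_cols <- names(data)[startsWith(names(data), prefix)]
  for (col in cluster_cols) {
    labels <- as.character(data[[col]])
    sizes <- table(labels)                    # number of cells per cluster
    small <- names(sizes)[sizes < min_cells]  # clusters below the threshold
    labels[labels %in% small] <- label
    data[[col]] <- labels
  }
  data
}

# Toy cluster assignments for 100 cells at three resolutions (made-up data)
cells <- data.frame(
  res.0.2 = rep(c(0, 1), times = c(60, 40)),
  res.0.4 = rep(c(0, 1, 2), times = c(50, 30, 20)),
  res.0.8 = rep(c(0, 1, 2, 3), times = c(55, 38, 4, 3))
)

# Option 1: bundle the small clusters into a single label
bundled <- bundle_small_clusters(cells, prefix = "res.", min_cells = 10)
print(table(bundled$res.0.8))  # clusters 2 and 3 now share the label "small"

# Option 2: keep only the lower resolutions
resolutions <- as.numeric(sub("res.", "", names(cells), fixed = TRUE))
low_res <- cells[, resolutions <= 0.4, drop = FALSE]

# Either result can then be passed to clustree as usual, e.g.
# library(clustree)
# clustree(bundled, prefix = "res.")
```

Note that the bundling is done separately for each resolution, so the bundled label at one resolution does not have to contain the same cells as the bundled label at another.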

Hope that helps.

repeatpipettor commented 2 years ago

Thank you for the helpful explanation! This is good to know. I will give your suggestions a try.