Closed Close-your-eyes closed 4 years ago
Hi @Close-your-eyes
Thanks for giving {clustree} a go! I will admit I am a bit rusty on the SC3 stability metric but I have seen similar behaviour before, although maybe not such a clear example as this one. Could you please try running this again with prop_filter = 0? That would let us see all the edges (even the very low-weight ones), which might clear things up a bit.
For reference here is the function in {clustree} that calculates the stability metric https://github.com/lazappi/clustree/blob/ac7e68d42346e96b2c6ecb358113705910130b3b/R/stability.R#L1-L26. The equivalent function in {SC3} is here https://github.com/hemberg-lab/SC3/blob/b1fcc700af1c7ea3601bf1a483324673342b8a0e/R/ShinyFunctions.R#L292-L339. They should give the same results but it is possible there is a bug somewhere.
Thanks for the quick response and giving those code references. I will try them.
Apart from that, I ran the code again using prop_filter = 0 as you suggested. Please see the result below. This is probably what you expected: several cells change their identities when k.param is varied. Their frequency or proportion is low, though.
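To put a number on "low proportion", a cross-tabulation of two clusterings can help. The sketch below uses made-up labels (k20 and k21 are hypothetical vectors, not the data from this issue) and assumes that the edge filter in {clustree} works on the share of a destination cluster that comes from a given source cluster, with a default threshold of 0.1. Please check the {clustree} documentation before relying on that.

```r
# Toy labels, invented for illustration only
k20 <- rep(c("0", "1", "2"), times = c(100, 50, 50))
k21 <- k20
k21[1:2] <- c("1", "2")  # two cells of cluster 0 switch to clusters 1 and 2

# Rows: cluster at k.param = 20, columns: cluster at k.param = 21
counts <- table(k20 = k20, k21 = k21)

# Share of each k21 cluster that comes from each k20 cluster (columns sum to 1)
in_prop <- prop.table(counts, margin = 2)

print(counts)
print(round(in_prop, 3))
```

In this toy case the edge from cluster 0 to cluster 1 carries a single cell out of 51, so it would only become visible once the filter is lowered.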
Thanks for the extra plots! Yes I thought we might see something like that. My feeling is the SC3 metric is sensitive to even very small changes in samples but I haven't tested that and it would still be good to double check that the {clustree} function gives the same results as the {SC3} version.
So, I rewrote the calculate_stability function of {SC3} so that it accepts any table with different clusterings of the same cells. When I feed in the data from above I get the same "bad" stability indices as in {clustree}, so I guess your implementation is correct. The "problem" must be the calculation in {SC3} itself, which seems to give quite high penalties. One factor is the division by "N" twice, as described in their manual. If only a few cells of one cluster change their identity to several different clusters at other ks, N increases and the stability drops a lot. If one cluster is adjacent to many others (as is the case for cluster 0 in my example), such changes of identity in a few cells may happen just by chance. But this is not the whole story. Maybe the {SC3} metric is not suitable in my case; I have to think more about why exactly. But the problem is not within {clustree}, so we may close the issue. Thank you @lazappi
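For anyone who wants to reproduce the effect, here is a minimal sketch of an SC3-style stability score computed from a plain table of clusterings. It is not the code from {SC3} or {clustree}: sc3_style_stability is a hypothetical helper, the toy labels are invented, and the normalisation by the number of other clusterings is my reading of the {SC3} function linked above, so treat it as an assumption.

```r
# Hypothetical helper (not part of {SC3} or {clustree}).
# clusterings: one row per cell, one column per clustering.
# col: column holding the clustering of interest; cluster: label to score.
sc3_style_stability <- function(clusterings, col, cluster) {
  clusterings <- as.matrix(clusterings)
  in_c1 <- clusterings[, col] == cluster
  others <- setdiff(seq_len(ncol(clusterings)), col)
  s <- 0
  for (col2 in others) {
    # Clusters in the other clustering that receive at least one of our cells
    targets <- unique(clusterings[in_c1, col2])
    n <- length(targets)
    for (target in targets) {
      in_c2 <- clusterings[, col2] == target
      # Shared cells over target cluster size, divided by N twice
      s <- s + sum(in_c1 & in_c2) / sum(in_c2) / n^2
    }
  }
  # Assumed normalisation: average over the other clusterings
  s / length(others)
}

# Toy data: four clusters of 100 cells each
k20 <- rep(0:3, each = 100)
k21_same <- k20
k21_moved <- k20
k21_moved[1:3] <- c(1, 2, 3)  # three cells of cluster 0 each join a different cluster

stable <- sc3_style_stability(cbind(k20, k21_same), col = 1, cluster = 0)
perturbed <- sc3_style_stability(cbind(k20, k21_moved), col = 1, cluster = 0)
print(c(stable = stable, perturbed = perturbed))
```

With these toy labels the score for cluster 0 is 1 when nothing moves and about 0.064 when just three of its 100 cells end up in three different clusters, because N goes from 1 to 4 and the sum is divided by N squared.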
Thanks for following this up, it's good to know you got the same results! 😸
I think you are right about the metric and it might be worth looking at a modified version that isn't as sensitive to small changes.
Dear all,
I wanted to apply the sc3_stability calculation to clusters that I have annotated in a Seurat object. Since I could not figure out how to apply the formula myself, I wanted to make use of the implementation in the clustree function. I calculated clusterings in Seurat (Louvain algorithm) where I kept the resolution constant but varied the k.param value in FindNeighbors. I provided the respective results to clustree as follows:
clustree.plot <- clustree(SO, prefix = "k.param.", suffix = "_integrated_snn_res.1.5", node_colour = "sc3_stability")
Between prefix and suffix is a number which indicates the k.param value used. When projecting the clusterings for k.param values from 20 to 22 onto a tSNE embedding, some clusters (e.g. cluster 0) actually appear stable (see the image below), but the sc3 stability is quite low.
Why is this the case? Did I get the concept of cluster stability wrong? Even if only a few cells change their identity at the edges of clusters, the penalty for that seems to be very high. Could it be that sc3 is not suitable for such adjacent clusters but rather for clearly separated ones? If this is the case then this issue may better be raised at the sc3 repository.