igrabski / sc-SHC

Significance analysis for clustering single-cell RNA-sequencing data

interpretation of the significance results #13

Closed whiteorchid closed 11 months ago

whiteorchid commented 11 months ago

Dear author,

Thanks a lot for sc-SHC, a great and innovative tool!

May I ask for your guidance on interpreting the significance results?

Based on the picture below, does it mean the best number of clusters is 11?

The clusters passed as input in new_clusters <- testClusters(data, as.character(clusters)) are the cluster labels from Seurat, which produced 12 clusters.

What does the number 0.55 after cluster 8 mean?

Thank you very much for your kind guidance!

Best,

image
igrabski commented 11 months ago

Thanks for trying out our tool!

The testClusters function takes a set of pre-computed clusters as input and then decides whether any of them should be merged. In this case, it looks like the 12 original clusters were merged into 11 final clusters. To see exactly how, you can compare new_clusters[[1]], which contains the new cluster labels, to your previous cluster labels as.character(clusters).
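
As a rough illustration of that comparison, here is a minimal base R sketch. The label vectors are made up for the example (the "new1"-style names are an assumption, not necessarily what testClusters returns); in practice old_labels would be as.character(clusters) and new_labels would be new_clusters[[1]].

```r
# Hypothetical labels for illustration only; in practice use
# old_labels <- as.character(clusters) and new_labels <- new_clusters[[1]].
old_labels <- c("0", "0", "1", "1", "2", "2", "3", "3")
new_labels <- c("new1", "new1", "new2", "new2", "new3", "new3", "new3", "new3")

# Cross-tabulate: rows are original clusters, columns are final clusters.
confusion <- table(original = old_labels, final = new_labels)
print(confusion)

# For each final cluster, list the original clusters it contains.
members <- lapply(split(old_labels, new_labels), function(x) sort(unique(x)))

# Final clusters containing more than one original cluster were merged.
merged <- members[lengths(members) > 1]
print(merged)
```

Any column of the cross-table with counts in more than one row corresponds to a final cluster that combines several original clusters; in this toy data that is the final cluster built from original clusters "2" and "3".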

The output you are showing here is a visualization of the tree of clusters that was tested as part of our approach. The numbers next to each node or cluster represent the adjusted p-value from assessing whether the clusters below represent distinct distributions or not. As you can see, many of the adjusted p-values here are 0, suggesting high confidence in the following clusters. However, here, Cluster 8 combines two of the original clusters, but the adjusted p-value for differentiating those two clusters is 0.55, which is why we keep those two clusters together.
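
To make the reading of those numbers concrete, here is a simplified sketch of the final comparison only. The threshold alpha = 0.05 and the node names are assumptions for illustration; the two p-values are the ones discussed above, and the real procedure computes and adjusts them along the tree rather than taking them as given.

```r
# Assumed significance threshold; check the value actually passed to testClusters.
alpha <- 0.05

# Adjusted p-values read off the tree; node names are hypothetical.
adj_p <- c(well_separated_split = 0, cluster8_split = 0.55)

# A split is kept only if its adjusted p-value falls below the threshold;
# otherwise the clusters under that node stay together.
decision <- ifelse(adj_p < alpha, "keep split", "merge")
print(decision)
```

Under this assumed threshold, the split with adjusted p-value 0 is kept, while the split under Cluster 8 (0.55) is not, so its two original clusters are reported as one.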