Closed mebbert closed 6 years ago
Hi @mebbert,
Can you clarify what you mean by "pre-defined cluster"?
Do you mean a flat set of cluster labels for the samples, e.g. if we have 5 samples, something like c("cluster1", "cluster1", "cluster2", "cluster3", "cluster3")? Or a hierarchical clustering of samples (generated from elsewhere)? Or something completely separate?
Thank you for your prompt response.
I'd like to run SigClust on the exact hierarchical clustering I generated in pheatmap, but I'm struggling to reproduce the same clustering directly in SigClust. So, it would be nice to be able to pass the pheatmap object into SigClust.
I'm probably missing something. Here are my pheatmap settings:
pheatmap(adj_contrast(sig.heat.symp.ctrl, 0.5),
clustering_distance_rows="correlation",
clustering_distance_cols=dist((1-cor(sig.heat.symp.ctrl, method = "pearson"))),
clustering_method="complete",
cluster_cols = TRUE,
cluster_rows = TRUE)
@mebbert, sorry for the delay.
Unfortunately, the shc function needs access to the original data matrix (sig.heat.symp.ctrl), and I don't think this information is available in the output of pheatmap. (Let me know if I'm wrong.)
Fortunately, it shouldn't be too hard to run shc on your data set, although if your matrix is very large, the analysis might take a while to run.
If you want to test for significance of clustering in the rows using "complete" linkage and Pearson correlation, as in the code you've posted, we just need to pass metric = "cor" and linkage = "complete" to the shc function. (The other parameters, null_alg and ci, have to be set to non-default values because correlation-based clustering violates some assumptions of the default algorithm.)
shc(adj_contrast(sig.heat.symp.ctrl, 0.5),
metric="cor", linkage="complete",
null_alg = "2means", ci = "2CI")
Similarly, to test for significance of clustering in the columns, we can run:
shc(t(adj_contrast(sig.heat.symp.ctrl, 0.5)),
metric="cor", linkage="complete",
null_alg = "2means", ci = "2CI")
(We simply need to transpose the data matrix with t() because shc tests for significance in the rows of the input matrix.)
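To make the row/column point concrete, here is a minimal base-R sketch of correlation-based clustering on rows versus columns. It does not call shc or pheatmap. The toy matrix x and the helper cor_hclust are hypothetical names made up for illustration, and the helper only mirrors the "1 - Pearson correlation, complete linkage" setup discussed above.

```r
# Toy data (made up): 8 rows ("genes") by 5 columns ("samples").
set.seed(42)
x <- matrix(rnorm(8 * 5), nrow = 8, ncol = 5,
            dimnames = list(paste0("gene", 1:8), paste0("sample", 1:5)))

# Hypothetical helper: hierarchically cluster the ROWS of m using
# 1 - Pearson correlation as the dissimilarity.
cor_hclust <- function(m, linkage = "complete") {
  # cor() correlates columns, so transpose to get row-by-row correlations.
  d <- as.dist(1 - cor(t(m), method = "pearson"))
  hclust(d, method = linkage)
}

hc_rows <- cor_hclust(x)     # tree over the 8 rows of x
hc_cols <- cor_hclust(t(x))  # transpose first: tree over the 5 columns of x

hc_rows$labels
hc_cols$labels
```

The same transpose trick is what turns a row-wise test into a column-wise one in the shc calls above.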
Hope this is helpful. Let me know if you have any more questions.
@pkimes, thank you. Looks like I had a mistake in my pheatmap parameters. I was using dist instead of as.dist for the correlation metric, so I was re-calculating distances based on the correlations rather than simply converting them to a dist object. That's why I couldn't reproduce the same cluster in shc.
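For anyone who hits the same problem, a small base-R sketch of the difference between dist and as.dist on a correlation-based matrix. The toy matrix x is made up for illustration and is not the data from this thread.

```r
# Toy data (made up): 6 rows by 5 columns; columns play the role of samples.
set.seed(1)
x <- matrix(rnorm(30), nrow = 6, ncol = 5,
            dimnames = list(NULL, paste0("s", 1:5)))

# Square matrix of 1 - Pearson correlation between columns.
d_mat <- 1 - cor(x, method = "pearson")

# as.dist() only re-wraps the square matrix as a dist object;
# the dissimilarity values themselves are unchanged.
d_as <- as.dist(d_mat)

# dist() treats each ROW of d_mat as a point and computes Euclidean
# distances between those rows, so it yields different values.
d_re <- dist(d_mat)

# Compare the two sets of pairwise values.
same_as_input <- isTRUE(all.equal(as.matrix(d_as), d_mat))
same_as_dist  <- isTRUE(all.equal(as.vector(d_as), as.vector(d_re)))
same_as_input
same_as_dist
```

Because the values differ, hclust can build a different tree from d_re than from d_as, which would explain a mismatch between the pheatmap dendrogram and the one from shc.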
Thanks for your help. Sorry for the confusion.
Hi, It would be great if it were possible to provide sigclust2 a pre-defined cluster (e.g., from pheatmap).