taiyun / corrplot

A visual exploratory tool on correlation matrix
https://github.com/taiyun/corrplot

Extract hclust information #222

Closed martin-jeremy closed 3 years ago

martin-jeremy commented 3 years ago

Hi!

Thank you for this useful package! I'm writing to you today because I have an issue with the hclust ordering implemented in the package.

I have a correlation matrix between 15 bulk RNA-seq samples. I run corrplot() on this matrix and obtain this plot:

corrplot(cor.mat , method = "color", is.corr = FALSE,
             order = "hclust", hclust.method = "complete", addrect = 2)

image

I would like to extract the hclust() result that corrplot() uses to order the samples and cut the data into two groups. I tried running:

dist(cor.mat) %>%
  hclust(., method = "complete") %>%
  as.dendrogram() %>%
  plot(., horiz = TRUE)

But this does not return the same clustering. As the dendrogram below shows, samples 01014V9 and 01016V9 are not clustered the same way as in the corrplot() output.

image

It's important for my downstream analysis to keep the same clustering. How can I extract the hclust() result from corrplot()?

Thanks for your help!

SessionInfo:

```R
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/local/INSERM/jeremy.martin/bin/conda/envs/bulkRNA/lib/libopenblasp-r0.3.12.so

locale:
 [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8 LC_MONETARY=fr_FR.UTF-8
 [6] LC_MESSAGES=fr_FR.UTF-8 LC_PAPER=fr_FR.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] pheatmap_1.0.12 RColorBrewer_1.1-2 corrplot_0.90 DESeq2_1.32.0 SummarizedExperiment_1.22.0
 [6] Biobase_2.52.0 MatrixGenerics_1.4.0 matrixStats_0.59.0 GenomicRanges_1.44.0 GenomeInfoDb_1.28.0
[11] IRanges_2.26.0 S4Vectors_0.30.0 BiocGenerics_0.38.0 forcats_0.5.1 stringr_1.4.0
[16] dplyr_1.0.7 purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tibble_3.1.2
[21] ggplot2_3.3.5 tidyverse_1.3.1

loaded via a namespace (and not attached):
 [1] bitops_1.0-7 fs_1.5.0 lubridate_1.7.10 bit64_4.0.5 httr_1.4.2 tools_4.1.0
 [7] backports_1.2.1 utf8_1.2.1 R6_2.5.0 DBI_1.1.1 colorspace_2.0-2 withr_2.4.2
[13] tidyselect_1.1.1 bit_4.0.4 compiler_4.1.0 cli_3.0.0 rvest_1.0.0 xml2_1.3.2
[19] DelayedArray_0.18.0 scales_1.1.1 genefilter_1.74.0 XVector_0.32.0 pkgconfig_2.0.3 highr_0.9
[25] dbplyr_2.1.1 fastmap_1.1.0 rlang_0.4.11 readxl_1.3.1 rstudioapi_0.13 RSQLite_2.2.5
[31] generics_0.1.0 jsonlite_1.7.2 BiocParallel_1.26.0 RCurl_1.98-1.3 magrittr_2.0.1 GenomeInfoDbData_1.2.6
[37] Matrix_1.3-4 Rcpp_1.0.6 munsell_0.5.0 fansi_0.4.2 lifecycle_1.0.0 stringi_1.6.2
[43] zlibbioc_1.38.0 grid_4.1.0 blob_1.2.1 crayon_1.4.1 lattice_0.20-44 Biostrings_2.60.0
[49] haven_2.4.1 splines_4.1.0 annotate_1.70.0 hms_1.1.0 KEGGREST_1.32.0 locfit_1.5-9.4
[55] knitr_1.33 pillar_1.6.1 geneplotter_1.70.0 reprex_2.0.0 XML_3.99-0.6 glue_1.4.2
[61] evaluate_0.14 modelr_0.1.8 png_0.1-7 vctrs_0.3.8 cellranger_1.1.0 gtable_0.3.0
[67] assertthat_0.2.1 cachem_1.0.5 xfun_0.24 xtable_1.8-4 broom_0.7.8 survival_3.2-11
[73] AnnotationDbi_1.54.0 memoise_2.0.0 ellipsis_0.3.2
```

taiyun commented 3 years ago

Try:

as.dist(1 - cor.mat)

See lines 77-80 of https://github.com/taiyun/corrplot/blob/master/R/corrMatOrder.R
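
For context, `dist(cor.mat)` computes Euclidean distances between the rows of the matrix, whereas `as.dist(1 - cor.mat)` uses one minus the correlation itself as the distance, so the two can give different trees. Below is a minimal sketch of rebuilding the clustering outside corrplot() with the suggested distance. It uses base R only; the `cor.mat` built from `mtcars` is a stand-in for the real 15-sample matrix, and `k = 2` is assumed to correspond to `addrect = 2` in the original call.

```r
# Stand-in correlation matrix (assumption: the real cor.mat is a square,
# symmetric matrix with row and column names, like this one).
cor.mat <- cor(mtcars)

# Distance suggested in the reply: 1 - correlation, converted to a dist object.
d <- as.dist(1 - cor.mat)

# Same linkage as hclust.method = "complete" in the corrplot() call.
hc <- hclust(d, method = "complete")

# Cut the tree into two groups (assumed to match addrect = 2).
groups <- cutree(hc, k = 2)

# Leaf order of the tree, as indices and as sample names.
ord <- order.dendrogram(as.dendrogram(hc))
ordered.names <- rownames(cor.mat)[ord]

# Horizontal dendrogram, as in the original attempt.
plot(as.dendrogram(hc), horiz = TRUE)
```

`groups` is a named integer vector giving the cluster of each sample, and `ordered.names` lists the samples in dendrogram order, so both can be reused in downstream analysis without going through the plot.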

martin-jeremy commented 3 years ago

It's working !

Thank you !