smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/
Other
316 stars 31 forks source link

Error while plotting dendrogram & module color bar is missing #164

Closed Dimmiso closed 6 months ago

Dimmiso commented 7 months ago

Hi, Thanks a lot for developing and supporting hdWGCNA! Since last week I often bump on error message and cannot figure out its cause. My flow ends up this way in some cases but not always.

Describe the bug when I plot genes dendrogram (PlotDendrogram(seurat_obj, main = title.dendr)) often I get error message:

“Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, : Length of colors vector not compatible with number of objects in 'order'.”

And plot’s lower part (with modules in colors) is missing (while dendrogramm looks OK).

Steps to reproduce It happens if I apply the flow to several different datasets but not the one Zhou_2020 (which comes with vignette). Under the link you can find the input object, as well as output workspace. I.e. if you try to make dendrogram on it, it should give the error message. https://drive.google.com/drive/folders/1bkSaE0yI2YD0FvluygPCX3ZMYtrBk3MD?usp=sharing

I would appreciate very much if you help me to undersatnd what is wrong with my flow! Thanks a lot in advance! Dmitry

seurat_obj <- SetupForWGCNA(
  seurat_obj,
  gene_select = "variable", # the gene selection approach
  wgcna_name = "ver.2" # the name of the hdWGCNA experiment
)

seurat_obj <- MetacellsByGroups(
  seurat_obj,
  group.by = c("cl.PEP.split"), # specify the columns in seurat_obj@meta.data to group by
  k = 8, #10, # nearest-neighbors parameter, default = 25
  max_shared = 3, #2-works, #5, # maximum number of shared cells between two metacells, default = 15
  min_cells = 25, #25, #default = 100
  ident.group = "cl.PEP.split", # set the Idents of the metacell seurat object
  assay = 'integrated',
  slot = 'scale.data',
  verbose = T
)

seurat_obj <- NormalizeMetacells(seurat_obj)

names.ident.metacells <- names(table(seurat_obj@misc[["ver.2"]][["wgcna_metacell_obj"]]@meta.data[["orig.ident"]]))

seurat_obj <- SetDatExpr(
  seurat_obj,
  group_name = names.ident.metacells, # the name of the group of interest in the group.by column
  group.by="cl.PEP.split", #the metadata column containing the cell type info. This same column should have also been used in MetacellsByGroups
  assay = 'integrated',
  slot = 'data'
  )

# Skipped since takes very looong time
system.time(seurat_obj <- TestSoftPowers(
  seurat_obj))

par.split <- 4
par.cut <- 0.995
par.size <- 15

seurat_obj <- ConstructNetwork(
  seurat_obj, soft_power=16,
  deepSplit = par.split, #simplified sensitivity, default 2, 3 works well here.
  detectCutHeight = par.cut,
  minModuleSize = par.size,
  overwrite_tom = T,
  setDatExpr=FALSE
)

title.dendr <- paste("deepSplit=", par.split, ", detectCutHeight=",par.cut,
", minModuleSize=", par.size, sep = "")

#PlotDendrogram(seurat_obj, main='deepSpl=1, CutHeight=0.975, MinSize=15')
PlotDendrogram(seurat_obj, main = title.dendr)

R session info

sessionInfo() R version 4.2.1 (2022-06-23) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS: /sw/apps/R/4.2.1/rackham/lib64/R/lib/libRblas.so LAPACK: /sw/apps/R/4.2.1/rackham/lib64/R/lib/libRlapack.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] dplyr_1.1.3 plyr_1.8.8 gridExtra_2.3 hdWGCNA_0.2.23 harmony_1.0.1 Rcpp_1.0.10
[7] SeuratWrappers_0.3.1 WGCNA_1.72-1 foreach_1.5.2 backports_1.4.1 memoise_2.0.1 fastcluster_1.2.3
[13] dynamicTreeCut_1.63-1 Matrix_1.5-4.1 patchwork_1.1.2 cowplot_1.1.1 SeuratObject_4.1.3 Seurat_4.3.0
[19] tidyselect_1.2.0

loaded via a namespace (and not attached): [1] Hmisc_5.1-0 igraph_1.4.3 lazyeval_0.2.2 sp_1.6-0 splines_4.2.1 listenv_0.9.0
[7] scattermore_1.1 GenomeInfoDb_1.34.9 ggplot2_3.4.2 digest_0.6.31 htmltools_0.5.5 GO.db_3.16.0
[13] fansi_1.0.4 checkmate_2.2.0 magrittr_2.0.3 tensor_1.5 cluster_2.1.4 doParallel_1.0.17
[19] ROCR_1.0-11 remotes_2.4.2 globals_0.16.2 Biostrings_2.66.0 tester_0.1.7 matrixStats_0.63.0
[25] docopt_0.7.1 R.utils_2.12.2 spatstat.sparse_3.0-1 colorspace_2.1-0 blob_1.2.4 ggrepel_0.9.3
[31] xfun_0.39 sparsesvd_0.2-2 crayon_1.5.2 RCurl_1.98-1.12 jsonlite_1.8.4 progressr_0.13.0
[37] spatstat.data_3.0-1 impute_1.72.3 survival_3.5-5 zoo_1.8-12 iterators_1.0.14 glue_1.6.2
[43] polyclip_1.10-4 gtable_0.3.3 zlibbioc_1.44.0 XVector_0.38.0 leiden_0.4.3 future.apply_1.11.0
[49] BiocGenerics_0.44.0 abind_1.4-5 scales_1.2.1 DBI_1.1.3 spatstat.random_3.1-5 miniUI_0.1.1.1
[55] htmlTable_2.4.1 viridisLite_0.4.2 xtable_1.8-4 reticulate_1.28 rsvd_1.0.5 foreign_0.8-84
[61] bit_4.0.5 preprocessCore_1.60.2 Formula_1.2-5 stats4_4.2.1 htmlwidgets_1.6.2 httr_1.4.6
[67] FNN_1.1.3.2 RColorBrewer_1.1-3 ellipsis_0.3.2 ica_1.0-3 farver_2.1.1 R.methodsS3_1.8.2
[73] pkgconfig_2.0.3 nnet_7.3-19 uwot_0.1.14 deldir_1.0-9 utf8_1.2.3 labeling_0.4.3
[79] rlang_1.1.1 reshape2_1.4.4 later_1.3.1 AnnotationDbi_1.60.2 munsell_0.5.0 tools_4.2.1
[85] cachem_1.0.8 cli_3.6.1 generics_0.1.3 RSQLite_2.3.1 ggridges_0.5.4 evaluate_0.21
[91] stringr_1.5.0 fastmap_1.1.1 yaml_2.3.7 goftest_1.2-3 knitr_1.43 bit64_4.0.5
[97] fitdistrplus_1.1-11 purrr_1.0.1 RANN_2.6.1 KEGGREST_1.38.0 pbapply_1.7-0 future_1.32.0
[103] nlme_3.1-162 mime_0.12 slam_0.1-50 R.oo_1.25.0 compiler_4.2.1 rstudioapi_0.15.0
[109] plotly_4.10.1 png_0.1-8 spatstat.utils_3.0-3 tibble_3.2.1 stringi_1.7.12 lattice_0.20-45
[115] vctrs_0.6.2 pillar_1.9.0 lifecycle_1.0.3 BiocManager_1.30.19 spatstat.geom_3.2-1 lmtest_0.9-40
[121] RcppAnnoy_0.0.20 data.table_1.14.8 bitops_1.0-7 irlba_2.3.5.1 httpuv_1.6.11 R6_2.5.1
[127] promises_1.2.0.1 KernSmooth_2.23-20 IRanges_2.32.0 parallelly_1.36.0 codetools_0.2-18 MASS_7.3-60
[133] withr_2.5.0 qlcMatrix_0.9.7 sctransform_0.3.5 S4Vectors_0.36.2 GenomeInfoDbData_1.2.9 parallel_4.2.1
[139] rpart_4.1.19 grid_4.2.1 tidyr_1.3.0 rmarkdown_2.21 Rtsne_0.16 spatstat.explore_3.2-1 [145] Biobase_2.58.0 shiny_1.7.4 base64enc_0.1-3

Screenshots

dendr_wo_modules_QQ

Dimmiso commented 6 months ago

Hi, I have some additional comments. Dataset, which I provided as example, is my old dataset. It is not the one I interested to analyse, but it is small and allowed to reproduce the error. The actual dataset I am interested to analyse is by far larger and the story with that is following. When I run that dataset first time, the whole analysis looked very good.

Then I tried to modified parameters but from some moment started always getting that error message, even when I tried to use original, previously successful parameters.
It not only through the error message, but also the whole dendrogram looked very different with very few but long "branches". I believe that it was that which cased problem, because this error message coincide with very few (one or two) models are identified, which is expected with such dendrogram structure.

Below I attach both dendrograms. They were done on the same dataset with presumably the same parameters, but before and after "code failure". The second one misses module colour bar, coincide with error message while plotting (but one or two modules were in fact identified).

bild

bild

As I mentioned in my original post, this issue with error occurs not with all data sets. For example, I run yet another dataset, it was no error message, modules were identified, but the dendrogram still looks alerting to me, with "very low complexity" (low number of very long branches):

bild

Thanks a lot for great package, hdWGCNA, and support!

Dmitry

Dimmiso commented 6 months ago

Hi again, little update. Now, together my colleague, we run example code/object on three different machines, windows and linux. All three ended up with the same dendrogram and the same error message. Thus, this issue is quite reproducible across different sets of package versions with given object. As my colleague pointed, command TestSoftPowers() took unexpectedly long time with my test object. Indeed, when I run my full object with 45000 cells on 4 cores, TestSoftPowers() takes 1,5 h. Probably too long. Thanks a lot! Dmitry

smorabit commented 6 months ago

I have personally not come across this error. Right off the bat I can see something that is potentially wrong with your code.

seurat_obj <- MetacellsByGroups(
  seurat_obj,
  group.by = c("cl.PEP.split"), # specify the columns in seurat_obj@meta.data to group by
  k = 8, #10, # nearest-neighbors parameter, default = 25
  max_shared = 3, #2-works, #5, # maximum number of shared cells between two metacells, default = 15
  min_cells = 25, #25, #default = 100
  ident.group = "cl.PEP.split", # set the Idents of the metacell seurat object
  assay = 'integrated',
  slot = 'scale.data',
  verbose = T
)

I don't recommend using the scale.data slot here, maybe try this again with the counts slot?

Dimmiso commented 6 months ago

Hi Sam, thanks a lot for advice, it solved the problem! Thanks for developing and supporting this great package! Best, Dmitry