smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/
Other
315 stars 31 forks source link

Question about module preservation with bulk RNA-seq dataset #222

Closed zitiansunshine closed 1 month ago

zitiansunshine commented 2 months ago

Dear Sam,

Hi, thank you very much for developing this amazing tool. I identified modules from a single cell RNA-seq dataset, and I am interested in checking whether the module is preserved in a bulk RNA-seq dataset from the same tissue. I want to ask whether this is possible with the implementation in hdWGCNA? Thank you very much.

smorabit commented 2 months ago

Hi,

I want to ask whether this is possible with the implementation in hdWGCNA?

This is not possible with hdWGCNA directly (unless you format your bulk data as a Seurat object), since this package is not built to be used with bulk transcriptomic datasets. Fortunately the original WGCNA package is built for bulk datasets and you should be able to use it. I can provide some code which may help but you will probably need to make some modifications for your particular data.

You are going to want to look at this function from WGCNA, and you will need to supply it with your two expression matrices and the set of gene modules. You can get the expression matrix from the hdWGCNA metacells like this:

datExpr <- GetDatExpr(seurat_obj)

Importantly, you need to make sure that the genes in the bulk RNA dataset match the single-cell dataset.


# suppose you have a samples x genes bulk RNAseq dataset called bulk_df 
head(bulk_df) 

# get the genes that are in common: 
genes_bulk <- colnames(bulk_df)
genes_sc <- colnames(datExpr)
genes_keep <- genes_bulk[genes_bulk %in% genes_sc]
bulk_df <- bulk_df[,genes_keep]

Next set up both expression matrices for the modulePreservation WGCNA function.

# set up multiExpr:
setLabels <- c("ref", "query")
multiExpr <- list(
  ref = list(data=datExpr_ref),
  query = list(data=datExpr_query)
)

# set up the modules
ref_modules <- list(ref = GetModules(seurat_obj)$module)

Run the modulePreservation function

mp <- WGCNA::modulePreservation(
    multiExpr,
    ref_modules,
    referenceNetworks = 1,
    nPermutations = 100 # set this to whatever number is suitable for you
  )

Note that I did not run this code and I am not sure if it will work, but I hope that it will help you get started!

zitiansunshine commented 2 months ago

Thank you very much for all of your help Sam. I truly appreciated for your detailed and thorough responses.

I think there may be a bug with hdWGCNA::modulePreservation() function in v0.2.27: I am not able to run the function and it always returns me the error: "Error in multiColor[[s]] : subscript out of bounds". There is no problem in projecting the modules and setting up the expression data for seurat_obj and seurat_ref.

I also experimented with v0.2.26, and hdWGCNA::modulePreservation() works fine. This is my sessionInfo:

`R version 4.2.2 (2022-10-31) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.3 LTS

Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] patchwork_1.1.2 dplyr_1.1.2 celda_1.14.2
[4] SingleCellExperiment_1.20.1 celldex_1.8.0 SummarizedExperiment_1.28.0 [7] Biobase_2.58.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
[10] IRanges_2.32.0 S4Vectors_0.36.2 BiocGenerics_0.44.0
[13] MatrixGenerics_1.10.0 matrixStats_0.63.0 Matrix_1.5-4
[16] hdWGCNA_0.2.26 ggrepel_0.9.3 ggplot2_3.4.2
[19] SeuratObject_4.1.3 Seurat_4.3.0 harmony_1.2.0
[22] Rcpp_1.0.10 diem_2.4.1

loaded via a namespace (and not attached): [1] utf8_1.2.3 spatstat.explore_3.1-0
[3] reticulate_1.28 tidyselect_1.2.0
[5] RSQLite_2.3.1 AnnotationDbi_1.60.2
[7] htmlwidgets_1.6.2 combinat_0.0-8
[9] grid_4.2.2 Rtsne_0.16
[11] munsell_0.5.0 codetools_0.2-19
[13] ica_1.0-3 preprocessCore_1.60.2
[15] future_1.32.0 miniUI_0.1.1.1
[17] withr_2.5.0 tester_0.1.7
[19] spatstat.random_3.1-4 colorspace_2.1-0
[21] progressr_0.13.0 filelock_1.0.2
[23] knitr_1.42 rstudioapi_0.14
[25] ROCR_1.0-11 assertive.base_0.0-9
[27] tensor_1.5 listenv_0.9.0
[29] labeling_0.4.2 GenomeInfoDbData_1.2.9
[31] polyclip_1.10-4 farver_2.1.1
[33] bit64_4.0.5 parallelly_1.35.0
[35] vctrs_0.6.2 generics_0.1.3
[37] xfun_0.39 BiocFileCache_2.6.1
[39] fastcluster_1.2.3 R6_2.5.1
[41] doParallel_1.0.17 RcppEigen_0.3.3.9.3
[43] gridGraphics_0.5-1 bitops_1.0-7
[45] spatstat.utils_3.0-3 cachem_1.0.8
[47] DelayedArray_0.24.0 promises_1.2.0.1
[49] scales_1.2.1 nnet_7.3-18
[51] gtable_0.3.3 globals_0.16.2
[53] goftest_1.2-3 WGCNA_1.72-1
[55] rlang_1.1.1 splines_4.2.2
[57] lazyeval_0.2.2 impute_1.72.3
[59] spatstat.geom_3.2-1 checkmate_2.2.0
[61] BiocManager_1.30.20 yaml_2.3.7
[63] reshape2_1.4.4 abind_1.4-5
[65] backports_1.4.1 httpuv_1.6.10
[67] Hmisc_5.1-0 tools_4.2.2
[69] ellipsis_0.3.2 RColorBrewer_1.1-3
[71] dynamicTreeCut_1.63-1 ggridges_0.5.4
[73] plyr_1.8.8 sparseMatrixStats_1.10.0
[75] base64enc_0.1-3 zlibbioc_1.44.0
[77] purrr_1.0.1 RCurl_1.98-1.12
[79] rpart_4.1.19 deldir_1.0-6
[81] pbapply_1.7-0 cowplot_1.1.1
[83] zoo_1.8-12 cluster_2.1.4
[85] magrittr_2.0.3 magick_2.7.4
[87] data.table_1.14.8 scattermore_1.0
[89] lmtest_0.9-40 RANN_2.6.1
[91] fitdistrplus_1.1-11 mime_0.12
[93] evaluate_0.21 xtable_1.8-4
[95] gridExtra_2.3 compiler_4.2.2
[97] tibble_3.2.1 KernSmooth_2.23-20
[99] crayon_1.5.2 htmltools_0.5.5
[101] later_1.3.1 Formula_1.2-5
[103] tidyr_1.3.0 MCMCprecision_0.4.0
[105] DBI_1.1.3 ExperimentHub_2.6.0
[107] assertive.files_0.0-2 WriteXLS_6.4.0
[109] dbplyr_2.3.2 assertive.numbers_0.0-2
[111] MASS_7.3-59 rappdirs_0.3.3
[113] cli_3.6.1 assertive.types_0.0-3
[115] parallel_4.2.2 igraph_1.4.2
[117] pkgconfig_2.0.3 foreign_0.8-84
[119] sp_1.6-0 plotly_4.10.1
[121] spatstat.sparse_3.0-1 foreach_1.5.2
[123] XVector_0.38.0 stringr_1.5.0
[125] digest_0.6.31 sctransform_0.3.5
[127] RcppAnnoy_0.0.20 spatstat.data_3.0-1
[129] Biostrings_2.66.0 enrichR_3.2
[131] rmarkdown_2.21 leiden_0.4.3
[133] htmlTable_2.4.1 uwot_0.1.14
[135] DelayedMatrixStats_1.20.0 curl_5.0.0
[137] shiny_1.7.4 rjson_0.2.21
[139] lifecycle_1.0.3 nlme_3.1-162
[141] jsonlite_1.8.4 viridisLite_0.4.2
[143] fansi_1.0.4 pillar_1.9.0
[145] lattice_0.20-45 KEGGREST_1.38.0
[147] fastmap_1.1.1 httr_1.4.6
[149] survival_3.5-5 GO.db_3.16.0
[151] interactiveDisplayBase_1.36.0 glue_1.6.2
[153] multipanelfigure_2.1.2 png_0.1-8
[155] iterators_1.0.14 BiocVersion_3.16.0
[157] bit_4.0.5 assertive.properties_0.0-5
[159] stringi_1.7.12 blob_1.2.4
[161] AnnotationHub_3.6.0 memoise_2.0.1
[163] irlba_2.3.5.1 future.apply_1.10.0 `

Again, thank you very much!

smorabit commented 2 months ago

I don't think there was any change in ModulePreservation between those versions so that is very strange! Could you please try again on the latest version?

smorabit commented 1 month ago

Closing due to inactivity.