smorabit / hdWGCNA

High dimensional weighted gene co-expression network analysis
https://smorabit.github.io/hdWGCNA/

Error in if (!check) when running Enrichr enrichment analysis #324

Closed jijjigae closed 1 week ago

jijjigae commented 2 weeks ago

Recurring unsolvable issue

This is an issue that I've tried for over three days to resolve unsuccessfully so I'm hoping someone with more experience with this can help me out!

Basically, when running EnrichR with hdWGCNA (following the basic code in the tutorial here), every call to RunEnrichr gives the same output: Error in if (!check) { : argument is of length zero. traceback() shows that the error occurs in CheckWGCNAName(seurat_obj, wgcna_name), which I understand was supposedly fixed recently.

Knowing this, I have performed the below steps with no success:

  1. Completely reinstalled hdWGCNA and WGCNA from both CRAN and Github (more than 10 times over a period of several days)
  2. Reinstalled dependent packages
  3. Checked I have the latest version of R
  4. Read the entirety of every even remotely related GitHub issue to try to troubleshoot the problem

I actually just learned R last week, so other than following tutorials and logical problem solving I have no experience with this.

I'm copying the most recent attempt with this code below:

[Workspace loaded from ~/.RData]

> # single-cell analysis package
Warning messages:
1: replacing previous import ‘GenomicRanges::intersect’ by ‘SeuratObject::intersect’ when loading ‘hdWGCNA’ 
2: replacing previous import ‘GenomicRanges::union’ by ‘dplyr::union’ when loading ‘hdWGCNA’ 
3: replacing previous import ‘GenomicRanges::setdiff’ by ‘dplyr::setdiff’ when loading ‘hdWGCNA’ 
4: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘hdWGCNA’ 
5: replacing previous import ‘Seurat::components’ by ‘igraph::components’ when loading ‘hdWGCNA’ 
6: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘hdWGCNA’ 
7: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘hdWGCNA’ 
8: replacing previous import ‘GenomicRanges::subtract’ by ‘magrittr::subtract’ when loading ‘hdWGCNA’ 
9: replacing previous import ‘Matrix::as.matrix’ by ‘proxy::as.matrix’ when loading ‘hdWGCNA’ 
10: replacing previous import ‘igraph::groups’ by ‘tidygraph::groups’ when loading ‘hdWGCNA’ 
> library(Seurat)
Loading required package: SeuratObject
Loading required package: sp
‘SeuratObject’ was built under R 4.4.0 but the current version is
4.4.1; it is recomended that you reinstall ‘SeuratObject’ as the ABI
for R may have changed
‘SeuratObject’ was built with package ‘Matrix’ 1.7.0 but the current
version is 1.7.1; it is recomended that you reinstall ‘SeuratObject’ as
the ABI for ‘Matrix’ may have changed

Attaching package: ‘SeuratObject’

The following objects are masked from ‘package:base’:

    intersect, t

> 
> # plotting and data science packages
> library(tidyverse)
── Attaching core tidyverse packages ───────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ─────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package to force all conflicts to become errors
> library(cowplot)

Attaching package: ‘cowplot’

The following object is masked from ‘package:lubridate’:

    stamp

> library(patchwork)

Attaching package: ‘patchwork’

The following object is masked from ‘package:cowplot’:

    align_plots

> 
> # co-expression network analysis packages:
> library(WGCNA)
Loading required package: dynamicTreeCut
Loading required package: fastcluster

Attaching package: ‘fastcluster’

The following object is masked from ‘package:stats’:

    hclust

Attaching package: ‘WGCNA’

The following object is masked from ‘package:stats’:

    cor

> library(hdWGCNA)
Loading required package: harmony
Loading required package: Rcpp
Loading required package: ggrepel
Loading required package: igraph

Attaching package: ‘igraph’

The following objects are masked from ‘package:lubridate’:

    %--%, union

The following objects are masked from ‘package:dplyr’:

    as_data_frame, groups, union

The following objects are masked from ‘package:purrr’:

    compose, simplify

The following object is masked from ‘package:tidyr’:

    crossing

The following object is masked from ‘package:tibble’:

    as_data_frame

The following object is masked from ‘package:Seurat’:

    components

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union

Loading required package: ggraph

Attaching package: ‘ggraph’

The following object is masked from ‘package:sp’:

    geometry

Loading required package: tidygraph

Attaching package: ‘tidygraph’

The following object is masked from ‘package:igraph’:

    groups

The following object is masked from ‘package:stats’:

    filter

Loading required package: UCell
Loading required package: GeneOverlap
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:igraph’:

    normalize, path, union

The following objects are masked from ‘package:lubridate’:

    intersect, setdiff, union

The following objects are masked from ‘package:dplyr’:

    combine, intersect, setdiff, union

The following object is masked from ‘package:SeuratObject’:

    intersect

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
    tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:tidygraph’:

    active, rename

The following objects are masked from ‘package:lubridate’:

    second, second<-

The following objects are masked from ‘package:dplyr’:

    first, rename

The following object is masked from ‘package:tidyr’:

    expand

The following object is masked from ‘package:utils’:

    findMatches

The following objects are masked from ‘package:base’:

    expand.grid, I, unname

Loading required package: IRanges

Attaching package: ‘IRanges’

The following object is masked from ‘package:tidygraph’:

    slice

The following object is masked from ‘package:lubridate’:

    %within%

The following objects are masked from ‘package:dplyr’:

    collapse, desc, slice

The following object is masked from ‘package:purrr’:

    reduce

The following object is masked from ‘package:sp’:

    %over%

Loading required package: GenomeInfoDb
> 
> # gene enrichment packages
> library(enrichR)
Welcome to enrichR
Checking connection ... 
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is Live!
WormEnrichr ... Connection is Live!
YeastEnrichr ... Connection is Live!
FishEnrichr ... Connection is Live!
OxEnrichr ... Connection is Live!
> library(GeneOverlap)
> 
> # using the cowplot theme for ggplot
> theme_set(theme_cowplot())
> 
> # set random seed for reproducibility
> set.seed(12345)
> 
> # load the Zhou et al snRNA-seq dataset
> seurat_obj <- readRDS("/Users/cinnamonsoup/Trem2AD.combined.rds")
> seurat_obj <- RunEnrichr(
+     seurat_obj,
+     dbs=dbs, # character vector of enrichr databases to test
+     max_genes = 100 # number of genes per module to test. use max_genes = Inf to choose all genes!
+ )
Error in if (!check) { : argument is of length zero
> traceback()
2: CheckWGCNAName(seurat_obj, wgcna_name)
1: RunEnrichr(seurat_obj, dbs = dbs, max_genes = 100)

The original code from the tutorial is as below:

# single-cell analysis package
library(Seurat)

# plotting and data science packages
library(tidyverse)
library(cowplot)
library(patchwork)

# co-expression network analysis packages:
library(WGCNA)
library(hdWGCNA)

# gene enrichment packages
library(enrichR)
library(GeneOverlap)

# using the cowplot theme for ggplot
theme_set(theme_cowplot())

# set random seed for reproducibility
set.seed(12345)

# load the Zhou et al snRNA-seq dataset
seurat_obj <- readRDS('data/Zhou_control.rds')

Up until here, there is no issue. Then, substituting my own .rds file for the tutorial data in the final readRDS line, I continue with this input and receive the error:


# enrichr databases to test
dbs <- c('GO_Biological_Process_2021','GO_Cellular_Component_2021','GO_Molecular_Function_2021')

# perform enrichment tests
seurat_obj <- RunEnrichr(
  seurat_obj,
  dbs=dbs, # character vector of enrichr databases to test
  max_genes = 100 # number of genes per module to test. use max_genes = Inf to choose all genes!
)
Error in if (!check) { : argument is of length zero

I tried to run the next part of the code anyway, also with no success; I receive the same message.

# retrieve the output table
enrich_df <- GetEnrichrTable(seurat_obj)

**R session info**

> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS 15.0.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Seoul
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] enrichR_3.2           hdWGCNA_0.4.00        GenomicRanges_1.56.2 
 [4] GenomeInfoDb_1.40.1   IRanges_2.38.1        S4Vectors_0.42.1     
 [7] BiocGenerics_0.50.0   GeneOverlap_1.40.0    UCell_2.8.0          
[10] tidygraph_1.3.1       ggraph_2.2.1          igraph_2.1.1         
[13] ggrepel_0.9.6         harmony_1.2.1         Rcpp_1.0.13          
[16] WGCNA_1.73            fastcluster_1.2.6     dynamicTreeCut_1.63-1
[19] patchwork_1.3.0       cowplot_1.1.3         lubridate_1.9.3      
[22] forcats_1.0.0         stringr_1.5.1         dplyr_1.1.4          
[25] purrr_1.0.2           readr_2.1.5           tidyr_1.3.1          
[28] tibble_3.2.1          ggplot2_3.5.1         tidyverse_2.0.0      
[31] Seurat_5.1.0          SeuratObject_5.0.2    sp_2.1-4             

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22            splines_4.4.1              
  [3] later_1.3.2                 bitops_1.0-9               
  [5] polyclip_1.10-7             preprocessCore_1.66.0      
  [7] rpart_4.1.23                fastDummies_1.7.4          
  [9] lifecycle_1.0.4             doParallel_1.0.17          
 [11] globals_0.16.3              lattice_0.22-6             
 [13] MASS_7.3-61                 backports_1.5.0            
 [15] magrittr_2.0.3              rmarkdown_2.28             
 [17] Hmisc_5.1-3                 plotly_4.10.4              
 [19] httpuv_1.6.15               sctransform_0.4.1          
 [21] spam_2.11-0                 spatstat.sparse_3.1-0      
 [23] reticulate_1.39.0           pbapply_1.7-2              
 [25] DBI_1.2.3                   RColorBrewer_1.1-3         
 [27] abind_1.4-8                 zlibbioc_1.50.0            
 [29] Rtsne_0.17                  WriteXLS_6.7.0             
 [31] nnet_7.3-19                 tweenr_2.0.3               
 [33] GenomeInfoDbData_1.2.12     irlba_2.3.5.1              
 [35] listenv_0.9.1               spatstat.utils_3.1-0       
 [37] goftest_1.2-3               RSpectra_0.16-2            
 [39] spatstat.random_3.3-2       fitdistrplus_1.2-1         
 [41] parallelly_1.38.0           leiden_0.4.3.1             
 [43] codetools_0.2-20            DelayedArray_0.30.1        
 [45] ggforce_0.4.2               tidyselect_1.2.1           
 [47] UCSC.utils_1.0.0            farver_2.1.2               
 [49] tester_0.2.0                viridis_0.6.5              
 [51] matrixStats_1.4.1           base64enc_0.1-3            
 [53] spatstat.explore_3.3-3      jsonlite_1.8.9             
 [55] BiocNeighbors_1.22.0        progressr_0.14.0           
 [57] Formula_1.2-5               ggridges_0.5.6             
 [59] survival_3.7-0              iterators_1.0.14           
 [61] foreach_1.5.2               tools_4.4.1                
 [63] ica_1.0-3                   glue_1.8.0                 
 [65] gridExtra_2.3               SparseArray_1.4.8          
 [67] xfun_0.48                   MatrixGenerics_1.16.0      
 [69] withr_3.0.1                 fastmap_1.2.0              
 [71] fansi_1.0.6                 caTools_1.18.3             
 [73] digest_0.6.37               timechange_0.3.0           
 [75] R6_2.5.1                    mime_0.12                  
 [77] colorspace_2.1-1            scattermore_1.2            
 [79] GO.db_3.19.1                gtools_3.9.5               
 [81] tensor_1.5                  spatstat.data_3.1-2        
 [83] RSQLite_2.3.7               utf8_1.2.4                 
 [85] generics_0.1.3              data.table_1.16.2          
 [87] graphlayouts_1.2.0          httr_1.4.7                 
 [89] htmlwidgets_1.6.4           S4Arrays_1.4.1             
 [91] uwot_0.2.2                  pkgconfig_2.0.3            
 [93] gtable_0.3.6                blob_1.2.4                 
 [95] impute_1.78.0               lmtest_0.9-40              
 [97] SingleCellExperiment_1.26.0 XVector_0.44.0             
 [99] htmltools_0.5.8.1           dotCall64_1.2              
[101] scales_1.3.0                Biobase_2.64.0             
[103] png_0.1-8                   spatstat.univar_3.0-1      
[105] knitr_1.48                  rstudioapi_0.17.1          
[107] rjson_0.2.23                tzdb_0.4.0                 
[109] reshape2_1.4.4              curl_5.2.3                 
[111] checkmate_2.3.2             nlme_3.1-166               
[113] proxy_0.4-27                zoo_1.8-12                 
[115] cachem_1.1.0                KernSmooth_2.23-24         
[117] parallel_4.4.1              miniUI_0.1.1.1             
[119] foreign_0.8-87              AnnotationDbi_1.66.0       
[121] pillar_1.9.0                grid_4.4.1                 
[123] vctrs_0.6.5                 gplots_3.2.0               
[125] RANN_2.6.2                  promises_1.3.0             
[127] xtable_1.8-4                cluster_2.1.6              
[129] htmlTable_2.4.3             evaluate_1.0.1             
[131] cli_3.6.3                   compiler_4.4.1             
[133] rlang_1.1.4                 crayon_1.5.3               
[135] future.apply_1.11.2         plyr_1.8.9                 
[137] stringi_1.8.4               viridisLite_0.4.2          
[139] deldir_2.0-4                BiocParallel_1.38.0        
[141] munsell_0.5.1               Biostrings_2.72.1          
[143] lazyeval_0.2.2              spatstat.geom_3.3-3        
[145] Matrix_1.7-1                RcppHNSW_0.6.0             
[147] hms_1.1.3                   bit64_4.5.2                
[149] future_1.34.0               KEGGREST_1.44.1            
[151] shiny_1.9.1                 SummarizedExperiment_1.34.0
[153] ROCR_1.0-11                 memoise_2.0.1              
[155] bit_4.5.0    

I am also aware that a similar issue has recently been solved in the newest release but no matter how many times I attempt a reinstall the issue is not solvable with the extent of my knowledge! As a side note, I was able to analyze the same Seurat object with just hdWGCNA, without running EnrichR.

Thanks in advance for any help you can offer, much appreciated!

smorabit commented 2 weeks ago

Hi, I am not able to recreate your error with the latest version of hdWGCNA, but I am looking at the behavior of the function CheckWGCNAName. I am able to get the same error message if I run this:

CheckWGCNAName(seurat_obj, wgcna_name=NULL)

However, in the code for RunEnrichr, wgcna_name should not be assigned to NULL. This is the code from RunEnrichr:

if(is.null(wgcna_name)){wgcna_name <- seurat_obj@misc$active_wgcna}
CheckWGCNAName(seurat_obj, wgcna_name)

I can't duplicate this behavior when running RunEnrichr, and I can't see where in the code it would be going wrong.

I wonder if somehow your active_wgcna has been set to NULL? Try print(seurat_obj@misc$active_wgcna). If this is NULL, you need to re-name it.
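Why a NULL name produces this particular message can be reproduced in base R. This is a minimal sketch, assuming the check inside CheckWGCNAName is a membership test along the lines of `wgcna_name %in% names(...)`; the thread does not show the actual implementation. The `misc` list and the `check_name` helper are hypothetical stand-ins, not hdWGCNA code.

```r
# Stand-in for seurat_obj@misc: a plain list with one experiment stored under "tutorial".
# (Assumption for illustration only; a real Seurat object is not needed to see the effect.)
misc <- list(tutorial = list())

wgcna_name <- misc$active_wgcna        # NULL, because no such element exists
check <- wgcna_name %in% names(misc)   # a membership test on NULL yields logical(0)
length(check)                          # 0

# if() needs exactly one TRUE/FALSE, so a zero-length condition is an error.
msg <- tryCatch(
  {
    if (!check) stop("name not found")
    "passed"
  },
  error = function(e) conditionMessage(e)
)
msg  # "argument is of length zero"

# Hypothetical guard: rejecting NULL before the lookup gives a clearer message.
check_name <- function(misc, wgcna_name) {
  if (is.null(wgcna_name)) {
    stop("wgcna_name is NULL; set it first, e.g. with SetActiveWGCNA()")
  }
  if (!(wgcna_name %in% names(misc))) {
    stop("unknown wgcna_name: ", wgcna_name)
  }
  invisible(TRUE)
}
```

With this guard, `check_name(misc, "tutorial")` returns silently, while `check_name(misc, NULL)` stops with a message that names the real problem.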

jijjigae commented 2 weeks ago

Hi there, thanks so much for your help. Yep, that indeed is the output of the function print(seurat_obj@misc$active_wgcna), I get NULL.

This is probably a very basic question, but how exactly do I rename it?

I previously didn't realize it had been set to NULL because running CheckWGCNAName() returns this error:

Error in CheckWGCNAName() : 
  argument "wgcna_name" is missing, with no default

smorabit commented 2 weeks ago

As shown in the tutorial, at the very beginning of the hdWGCNA pipeline you run SetupForWGCNA, and in this function you specify the wgcna_name.

I am not sure what you used for wgcna_name, but for example in the tutorial we used wgcna_name = "tutorial". In this example, you could do this:

seurat_obj <- SetActiveWGCNA(seurat_obj, "tutorial")

# this should now say "tutorial"
print(seurat_obj@misc$active_wgcna)

I hope that this helps you fix your problem. But I am confused how or why this was set to NULL in the first place...

jijjigae commented 2 weeks ago

Thanks for your help. It did let me set the name to something else, but afterwards I tried to run Enrichr according to the code in the tutorial and I was met with this error:

> seurat_obj <- RunEnrichr(
+     seurat_obj,
+     dbs=dbs, # character vector of enrichr databases to test
+     max_genes = 100 # number of genes per module to test. use max_genes = Inf to choose all genes!
+ )
Error in RunEnrichr(seurat_obj, dbs = dbs, max_genes = 100) : 
  object 'module' not found

Adding more arguments, including the wgcna_name, returned this error:

> RunEnrichr(seurat_obj, dbs=dbs, max_genes = 100, wait = TRUE, wait_time=5, wgcna_name = Trem2AD.combined)
Error in match(x, table, nomatch = 0L) : 
  'match' requires vector arguments

I checked and the Seurat object doesn't appear to have any issues, as mentioned previously running other hdWGCNA commands other than EnrichR work fine.

Oh, and as for dbs potentially requiring vector arguments, I had already set dbs to my preferred databases, and verified that it works:

> dbs
[1] "GO_Molecular_Function_2023"      
[2] "GO_Molecular_Component_2023"     
[3] "Kinase_Perturbations_from_GEO_up"
[4] "Rummagene_kinases"  

Other than the dbs argument I can't manage to figure out what else would require vector arguments which isn't already specified...
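One plausible reading of the 'match' error, not confirmed anywhere in this thread, is that `wgcna_name = Trem2AD.combined` was passed without quotes. R then looks up an object of that name in the workspace instead of using the string, and a non-vector object cannot be matched against a set of names. The base R sketch below reproduces the same message; the environment and the `known_names` vector are stand-ins chosen for illustration.

```r
# A non-vector object standing in for whatever `Trem2AD.combined` might point to
# in a restored workspace (assumption for illustration only).
Trem2AD.combined <- new.env()

# Hypothetical set of experiment names to look the name up in.
known_names <- c("Trem2AD.combined", "tutorial")

# Unquoted: R passes the object itself, which match() cannot handle.
unquoted <- tryCatch(
  Trem2AD.combined %in% known_names,
  error = function(e) conditionMessage(e)
)
unquoted  # "'match' requires vector arguments"

# Quoted: a character string is matched against the names as intended.
quoted <- "Trem2AD.combined" %in% known_names
quoted    # TRUE
```

If this is what happened, passing the name as a string (`wgcna_name = "Trem2AD.combined"`) would at least avoid this specific error, though it would not by itself create any missing modules.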

smorabit commented 2 weeks ago

Are you sure that you are using the correct wgcna_name which you used previously?

jijjigae commented 2 weeks ago

Yes, and I tried it several times as well. Even when reloading the seurat_object and renaming it from the beginning it brings up the Error in RunEnrichr(seurat_obj, dbs = dbs, max_genes = 100) : object 'module' not found

smorabit commented 2 weeks ago

I am not sure how this happened but it seems like this wgcna_name does not have any modules, which means that you have probably not run ConstructNetwork.

GetModules(seurat_obj, wgcna_name)

Does this show you NULL?

Please ensure that you have run the main steps of hdWGCNA as outlined in the tutorial.
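The two checks suggested so far (an active name that is not NULL, and modules that exist for that name) can be combined into a small pre-flight test before calling RunEnrichr. This is only a sketch: `preflight_enrichr` is a hypothetical helper, not part of hdWGCNA, and it takes the two values as plain arguments so that it runs without a Seurat object. The module table's column names below are illustrative.

```r
# Hypothetical helper. In a real session the inputs would come from the thread's own calls:
#   active_name <- seurat_obj@misc$active_wgcna
#   modules     <- GetModules(seurat_obj, active_name)
preflight_enrichr <- function(active_name, modules) {
  # The name must be a single character string, not NULL and not some other object.
  if (is.null(active_name) || !is.character(active_name) || length(active_name) != 1) {
    return("no usable wgcna_name: run SetupForWGCNA or SetActiveWGCNA first")
  }
  # Modules only exist after the network has been built.
  if (is.null(modules) || NROW(modules) == 0) {
    return("no modules found: run the network steps (e.g. ConstructNetwork) first")
  }
  "ok"
}

# Stand-in values so the sketch can be run as is.
r1 <- preflight_enrichr(NULL, NULL)
r2 <- preflight_enrichr("tutorial", NULL)
r3 <- preflight_enrichr(
  "tutorial",
  data.frame(gene_name = c("A", "B"), module = c("M1", "M2"))
)
```

Here `r1` and `r2` describe the missing step, and only `r3` is "ok".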

jijjigae commented 2 weeks ago

I re-loaded all libraries according to the tutorial and ran SetupForWGCNA, setting the WGCNA name as we discussed previously. However, I got this error once more:

> GetModules(seurat_obj, Trem2AD.combined)
Error in match(x, table, nomatch = 0L) : 
  'match' requires vector arguments

The exact code that I ran, with outputs, is below:

library(hdWGCNA)
> # single-cell analysis package
> library(Seurat)
> 
> # plotting and data science packages
> library(tidyverse)
> library(cowplot)
> library(patchwork)
> 
> # co-expression network analysis packages:
> library(WGCNA)
> library(hdWGCNA)
> 
> # using the cowplot theme for ggplot
> theme_set(theme_cowplot())
> 
> # set random seed for reproducibility
> set.seed(12345)
> 
> # optionally enable multithreading
> enableWGCNAThreads(nThreads = 8)
Warning in allowWGCNAThreads: Requested number of threads is higher than number of available processors (or cores). Using too many threads may degrade code performance. It is recommended that the number of threads is no more than number
 of available processors.

Allowing parallel execution with up to 8 working processes.
> 
> # load the dataset
> seurat_obj <- readRDS("/Users/cinnamonsoup/Trem2AD.combined.rds")
> seurat_obj <- SetupForWGCNA(
+   seurat_obj,
+   gene_select = "fraction", # the gene selection approach
+   fraction = 0.05, # fraction of cells that a gene needs to be expressed in order to be included
+   wgcna_name = "Trem2AD.combined" # the name of the hdWGCNA experiment
+ )
> GetModules(seurat_obj, Trem2AD.combined)
Error in match(x, table, nomatch = 0L) : 
  'match' requires vector arguments

jijjigae commented 1 week ago

Reading your comment, I realized I hadn't known that I had to go through the entire hdWGCNA tutorial before running the EnrichR analysis. As you can see from the code above, I figured for some reason that just running SetupForWGCNA was sufficient.

I figured it was an obvious error... turns out it was, lol.

I went back and ran the entire hdWGCNA tutorial from the beginning before running EnrichR and it worked; I was able to get outputs of GO analysis for all modules.

Thanks so much for taking the time to reply!