z0on / GO_MWU

Rank-based Gene Ontology analysis of gene expression data

wilcoxon test error: not enough y observations/cannot compute exact p-value with ties #13

Closed yaaminiv closed 1 year ago

yaaminiv commented 1 year ago

I'm running into the same error described in this issue for analysis of signed WGCNA modules:

Error in wilcox.test.default(nrg[sgo.yes], nrg[sgo.no], alternative = Alternative) : not enough 'y' observations
5. stop("not enough 'y' observations")
4. wilcox.test.default(nrg[sgo.yes], nrg[sgo.no], alternative = Alternative)
3. wilcox.test(nrg[sgo.yes], nrg[sgo.no], alternative = Alternative) at gomwu.functions.R#122
2. mwuTest(rsq.m, "g") at gomwu.functions.R#73
1. gomwuStats(input, goDatabase, goAnnotations, goDivision, perlPath = "perl", largest = 0.1, smallest = 5, clusterCutHeight = 0.25, Module = TRUE, Alternative = "g")

I re-cloned the latest repository and I'm still getting this error. I'm seeing it for two signed WGCNA modules (# genes in these two modules = 638 and 439).

There are also some modules where the test finishes running, but with the following warning:

Warning in wilcox.test.default(nrg[sgo.yes], nrg[sgo.no], alternative = Alternative) :
  cannot compute exact p-value with ties

My input files are below:

Is there a reason why I'm getting these errors?

z0on commented 1 year ago

Apologies for the delay. It looks like you have unannotated genes in your GO annotations table (those with NA entries instead of GO terms in the second column). Remove them and it should work!
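For anyone landing here with the same problem, here is a minimal sketch of that cleanup in R. It assumes the annotations table has been read into a two-column data frame; the column names `gene` and `go`, the gene IDs, and the GO terms below are hypothetical toy values, not taken from the attached files.

```r
# Toy two-column annotations table (hypothetical gene IDs and GO terms).
annot <- data.frame(
  gene = c("gene1", "gene2", "gene3", "gene4"),
  go   = c("GO:0008150;GO:0003674", NA, "GO:0005575", "NA"),
  stringsAsFactors = FALSE
)

# Assumption: a gene counts as unannotated if the second column is a real NA,
# the literal string "NA", or an empty string.
unannotated <- is.na(annot$go) | annot$go %in% c("NA", "")

# Keep only the annotated genes; the original row order is preserved.
annot_clean <- annot[!unannotated, ]
```

The filtered table would then need to be written back out as a tab-delimited file before rerunning the analysis.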

yaaminiv commented 1 year ago

@z0on Not resolved yet!

I removed the unannotated genes from my GO annotations table and I'm still getting the same error about not enough y observations for the grey60 and darkgrey modules. The module tables of significance measures are in the original post; the updated GO annotations table is below:

GO-Annotations-Table-nonredundant.txt

Any suggestions? I'm also still getting the warning about p-value computation with ties.

z0on commented 1 year ago

Hmm, it works for me tho, with this file yaaminGO_annotations.txt

yaaminiv commented 1 year ago

I still get the same error about not enough y observations using the file you sent over. Could it be an issue with library/R versions? I'm using perl v5.30.3 built for darwin-thread-multi-2level.

My R session information is below:

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.0.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] ape_5.6-2                   cowplot_1.1.1              
 [3] patchwork_1.1.2             WGCNA_1.71                 
 [5] fastcluster_1.2.3           dynamicTreeCut_1.63-1      
 [7] DESeq2_1.36.0               SummarizedExperiment_1.26.1
 [9] Biobase_2.56.0              MatrixGenerics_1.8.1       
[11] matrixStats_0.62.0          GenomicRanges_1.48.0       
[13] GenomeInfoDb_1.32.4         IRanges_2.30.1             
[15] S4Vectors_0.34.0            BiocGenerics_0.42.0        
[17] RColorBrewer_1.1-3          forcats_0.5.2              
[19] stringr_1.4.1               dplyr_1.0.10               
[21] purrr_0.3.5                 readr_2.1.3                
[23] tidyr_1.2.1                 tibble_3.1.8               
[25] ggplot2_3.4.0               tidyverse_1.3.2            

loaded via a namespace (and not attached):
  [1] googledrive_2.0.0      colorspace_2.0-3       deldir_1.0-6          
  [4] ellipsis_0.3.2         htmlTable_2.4.1        XVector_0.36.0        
  [7] base64enc_0.1-3        fs_1.5.2               rstudioapi_0.14       
 [10] bit64_4.0.5            AnnotationDbi_1.58.0   fansi_1.0.3           
 [13] lubridate_1.8.0        xml2_1.3.3             codetools_0.2-18      
 [16] splines_4.2.1          doParallel_1.0.17      impute_1.70.0         
 [19] cachem_1.0.6           geneplotter_1.74.0     knitr_1.40            
 [22] Formula_1.2-4          jsonlite_1.8.3         broom_1.0.1           
 [25] annotate_1.74.0        cluster_2.1.4          GO.db_3.15.0          
 [28] dbplyr_2.2.1           png_0.1-7              compiler_4.2.1        
 [31] httr_1.4.4             backports_1.4.1        assertthat_0.2.1      
 [34] Matrix_1.5-1           fastmap_1.1.0          gargle_1.2.1          
 [37] cli_3.4.1              htmltools_0.5.3        tools_4.2.1           
 [40] gtable_0.3.1           glue_1.6.2             GenomeInfoDbData_1.2.8
 [43] Rcpp_1.0.9             cellranger_1.1.0       vctrs_0.5.0           
 [46] Biostrings_2.64.1      nlme_3.1-160           preprocessCore_1.58.0 
 [49] iterators_1.0.14       xfun_0.34              rvest_1.0.3           
 [52] lifecycle_1.0.3        XML_3.99-0.12          googlesheets4_1.0.1   
 [55] zlibbioc_1.42.0        scales_1.2.1           hms_1.1.2             
 [58] parallel_4.2.1         yaml_2.3.6             gridExtra_2.3         
 [61] memoise_2.0.1          rpart_4.1.19           latticeExtra_0.6-30   
 [64] stringi_1.7.8          RSQLite_2.2.18         genefilter_1.78.0     
 [67] foreach_1.5.2          checkmate_2.1.0        BiocParallel_1.30.4   
 [70] rlang_1.0.6            pkgconfig_2.0.3        bitops_1.0-7          
 [73] evaluate_0.18          lattice_0.20-45        htmlwidgets_1.5.4     
 [76] bit_4.0.4              tidyselect_1.2.0       magrittr_2.0.3        
 [79] R6_2.5.1               generics_0.1.3         Hmisc_4.7-1           
 [82] DelayedArray_0.22.0    DBI_1.1.3              foreign_0.8-83        
 [85] pillar_1.8.1           haven_2.5.1            withr_2.5.0           
 [88] nnet_7.3-18            survival_3.4-0         KEGGREST_1.36.3       
 [91] RCurl_1.98-1.9         modelr_0.1.9           crayon_1.5.2          
 [94] interp_1.1-3           utf8_1.2.2             tzdb_0.3.0            
 [97] rmarkdown_2.17         jpeg_0.1-9             locfit_1.5-9.6        
[100] readxl_1.4.1           data.table_1.14.4      blob_1.2.3            
[103] reprex_2.0.2           digest_0.6.30          xtable_1.8-4          
[106] munsell_0.5.0

yaaminiv commented 1 year ago

@z0on I also noticed the file you shared had genes listed in a different order to the nonredundant file I sent over. Is there a difference in how you processed the data that would lead to a different gene order? Perhaps not the most pressing issue, but I want to make sure I didn't miss a step.

z0on commented 1 year ago

Aha! I found another issue with your annotations file: GO terms are separated by a semicolon followed by a space; the separator should be just a semicolon. Here is the corrected version. (The order of genes seems to match the original one, just omitting genes with NA in the second column.)

yaaminGO_annotations.txt
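A minimal sketch of the separator fix in R, for reference. The GO strings below are toy values and the variable names are hypothetical; the idea is simply to drop any whitespace around each semicolon so that the terms are joined by a bare `;`.

```r
# Toy GO term strings using the problematic "; " separator (hypothetical values).
go_raw <- c("GO:0008150; GO:0003674; GO:0005575", "GO:0005575")

# Remove whitespace on either side of every semicolon.
go_fixed <- gsub("[[:space:]]*;[[:space:]]*", ";", go_raw)

# Also trim stray leading or trailing whitespace from each entry.
go_fixed <- trimws(go_fixed)
```

Applied to the second column of the annotations table, this should leave entries with no change when they already use a bare semicolon.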

yaaminiv commented 1 year ago

Removing the spaces worked!