lgeistlinger / EnrichmentBrowser

Seamless navigation through combined results of set-based and network-based enrichment analysis
20 stars 11 forks source link

error in eaBrowse when there are no significant gene sets #13

Closed olkap closed 6 years ago

olkap commented 6 years ago

The eaBrowse function generates an uninformative error, when there are no significant gene sets and nr.show is set to -1. Below is the sample code:

#_________________
# Load packages
library("EnrichmentBrowser")
library("ALL")
library("hgu95av2.db")

#________________
# Load the ALL dataset
data(ALL)

#________________
#select B-cell ALL patients with and without the BCR/ABL fusion
ind.bs <- grep("^B", ALL$BT)  
ind.mut <- which(ALL$mol.biol %in% c("BCR/ABL", "NEG"))
sset <- intersect(ind.bs, ind.mut)
all.eset <- ALL[, sset]

#________________
#We compute gene expression values as the average of the corresponding probe values.
allSE <- probe2gene(all.eset) 

#______________________
# Create GROUP variable (the GROUP variable indicates whether the BCR-ABL gene fusion is present (1) or not (0).)
allSE$GROUP <- ifelse(allSE$mol.biol == "BCR/ABL", 1, 0)

#____________________
#differential expression analysis
allSE <- deAna(allSE)

#___________________
#obtain gene sets
gs_kegg <- getGenesets(org="hsa", db="kegg")

#_________________
#Perform ora analysis with fdr adjustment
kegg.res.fdr <- sbea(method="ora", se=allSE, gs=gs_kegg, padj.method = "fdr")

#_________________
#Return 5 significant gene sets
eaBrowse(kegg.res.fdr, nr.show = 5)
#IT WORKS

#_________________
#Return all significant gene sets
eaBrowse(kegg.res.fdr)
#IT DOES NOT WORK

The message that appears after running eaBrowse:

Error in if (substring(gs.id, 1, 3) == "GO:") return("GO") else if (grepl("^[a-z]{3}[0-9]{5}",  : 
  missing value where TRUE/FALSE needed

sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pathview_1.20.0             bindrcpp_0.2.2              hgu95av2.db_3.2.3           org.Hs.eg.db_3.6.0         
 [5] AnnotationDbi_1.42.1        ALL_1.22.0                  EnrichmentBrowser_2.10.9    graph_1.58.0               
 [9] SummarizedExperiment_1.10.1 DelayedArray_0.6.4          BiocParallel_1.14.2         matrixStats_0.54.0         
[13] Biobase_2.40.0              GenomicRanges_1.32.6        GenomeInfoDb_1.16.0         IRanges_2.14.10            
[17] S4Vectors_0.18.3            BiocGenerics_0.26.0        

loaded via a namespace (and not attached):
  [1] circlize_0.4.4           backports_1.1.2          GOstats_2.46.0           Hmisc_4.1-1             
  [5] fastmatch_1.1-0          BiocFileCache_1.4.0      plyr_1.8.4               igraph_1.2.2            
  [9] lazyeval_0.2.1           GSEABase_1.42.0          splines_3.5.1            ggplot2_3.0.0           
 [13] digest_0.6.15            BiocInstaller_1.30.0     ensembldb_2.4.1          htmltools_0.3.6         
 [17] GOSemSim_2.6.0           viridis_0.5.1            GO.db_3.6.0              magrittr_1.5            
 [21] checkmate_1.8.5          memoise_1.1.0            BSgenome_1.48.0          cluster_2.0.7-1         
 [25] limma_3.36.2             ComplexHeatmap_1.18.1    Biostrings_2.48.0        annotate_1.58.0         
 [29] R.utils_2.6.0            ggbio_1.28.4             prettyunits_1.0.2        enrichplot_1.0.2        
 [33] colorspace_1.3-2         blob_1.1.1               rappdirs_0.3.1           ggrepel_0.8.0           
 [37] dplyr_0.7.6              crayon_1.3.4             RCurl_1.95-4.11          genefilter_1.62.0       
 [41] bindr_0.1.1              VariantAnnotation_1.26.1 survival_2.42-6          glue_1.3.0              
 [45] gtable_0.2.0             zlibbioc_1.26.0          XVector_0.20.0           UpSetR_1.3.3            
 [49] GetoptLong_0.1.7         graphite_1.26.1          Rgraphviz_2.24.0         shape_1.4.4             
 [53] SparseM_1.77             scales_0.5.0             DOSE_3.6.1               DBI_1.0.0               
 [57] GGally_1.4.0             edgeR_3.22.3             Rcpp_0.12.18             progress_1.2.0          
 [61] viridisLite_0.3.0        xtable_1.8-2             htmlTable_1.12           units_0.6-0             
 [65] foreign_0.8-71           bit_1.1-14               reactome.db_1.64.0       OrganismDbi_1.22.0      
 [69] Formula_1.2-3            AnnotationForge_1.22.2   htmlwidgets_1.2          httr_1.3.1              
 [73] fgsea_1.6.0              RColorBrewer_1.1-2       acepack_1.4.1            R.methodsS3_1.7.1       
 [77] pkgconfig_2.0.1          reshape_0.8.7            XML_3.98-1.13            nnet_7.3-12             
 [81] dbplyr_1.2.2             locfit_1.5-9.1           tidyselect_0.2.4         rlang_0.2.1             
 [85] reshape2_1.4.3           munsell_0.5.0            tools_3.5.1              RSQLite_2.1.1           
 [89] ggridges_0.5.0           stringr_1.3.1            knitr_1.20               bit64_0.9-7             
 [93] purrr_0.2.5              AnnotationFilter_1.4.0   KEGGREST_1.20.1          ggraph_1.0.2            
 [97] ReactomePA_1.24.0        RBGL_1.56.0              R.oo_1.22.0              KEGGgraph_1.40.0        
[101] DO.db_2.9                biomaRt_2.36.1           compiler_3.5.1           rstudioapi_0.7          
[105] curl_3.2                 png_0.1-7                PFAM.db_3.6.0            tibble_1.4.2            
[109] tweenr_0.1.5             geneplotter_1.58.0       stringi_1.1.7            GenomicFeatures_1.32.0  
[113] lattice_0.20-35          ProtGenerics_1.12.0      Matrix_1.2-14            pillar_1.3.0            
[117] GlobalOptions_0.1.0      data.table_1.11.4        cowplot_0.9.3            bitops_1.0-6            
[121] rtracklayer_1.40.3       qvalue_2.12.0            hwriter_1.3.2            R6_2.2.2                
[125] latticeExtra_0.6-28      gridExtra_2.3            biocGraph_1.42.0         dichromat_2.0-0         
[129] MASS_7.3-50              assertthat_0.2.0         rjson_0.2.20             DESeq2_1.20.0           
[133] Category_2.46.0          ReportingTools_2.20.0    safe_3.20.0              GenomicAlignments_1.16.0
[137] Rsamtools_1.32.2         GenomeInfoDbData_1.1.0   hms_0.4.2                grid_3.5.1              
[141] rpart_4.1-13             biovizBase_1.28.1        ggforce_0.1.3            base64enc_0.1-3         
lgeistlinger commented 6 years ago

Agreed the error message can be more informative.

However, there is not much that eaBrowse can do as in this case there are no significant gene sets:

> kegg.res.fdr$nr.sigs
[1] 0
> gsRanking(kegg.res.fdr, signif.only=FALSE)
DataFrame with 319 rows and 4 columns
                                           GENE.SET GLOB.STAT NGLOB.STAT
                                        <character> <numeric>  <numeric>
1    hsa04622_RIG-I-like_receptor_signaling_pathway         4     0.0741
2                        hsa05416_Viral_myocarditis         4     0.0727
3        hsa00053_Ascorbate_and_aldarate_metabolism         1     0.0714
4                      hsa00910_Nitrogen_metabolism         1     0.0714
5    hsa05130_Pathogenic_Escherichia_coli_infection         3     0.0698
...                                             ...       ...        ...
315                             hsa05034_Alcoholism         0          0
316             hsa04915_Estrogen_signaling_pathway         0          0
317 hsa04261_Adrenergic_signaling_in_cardiomyocytes         0          0
318                      hsa00230_Purine_metabolism         0          0
319             hsa04022_cGMP-PKG_signaling_pathway         0          0
              P.VALUE
            <numeric>
1   0.817762214983713
2   0.817762214983713
3   0.817762214983713
4   0.817762214983713
5   0.817762214983713
...               ...
315 0.868164556962025
316 0.868164556962025
317 0.872470031545741
318 0.878754716981132
319             0.913

This is exactly the use case for the nr.show argument, i.e. the choice of how many gene sets to display in such a case is left to the user.

lgeistlinger commented 6 years ago

A more informative error message has been introduced for this case in v2.10.10