varemo / piano

piano - An R/Bioconductor package for gene set analysis
https://varemo.github.io/piano/

Error in pValues[tmp] <- -pValues[tmp]: NAs are not allowed in subscripted assignments #6

Open janstrauss1 opened 4 years ago

janstrauss1 commented 4 years ago

Dear @varemo,

I currently run into the following error when plotting with networkPlot or networkPlot2 using class = "distinct" and direction = "both" for a larger set of 5712 genes:

Error in pValues[tmp] <- -pValues[tmp] : 
  NAs are not allowed in subscripted assignments
In addition: Warning messages:
1: In FUN(newX[, i], ...) : no non-missing arguments to min; returning Inf
2: In FUN(newX[, i], ...) : no non-missing arguments to min; returning Inf

Yet, it works fine when I plot networkPlot or networkPlot2 using class = "distinct" but specifying a single direction (direction = "up" or direction = "down").

Strangely, I don't encounter this issue when I run the same code on a smaller, similar data set of 968 genes.

I'm a bit puzzled. Any ideas where the error might be?

Many thanks in advance,

Jan

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] limma_3.40.6        visNetwork_2.0.9    plier_1.54.0        affy_1.62.0         Biobase_2.44.0     
[6] BiocGenerics_0.30.0 piano_2.0.2        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4            lattice_0.20-38       relations_0.6-9       gtools_3.8.1          assertthat_0.2.1     
 [6] digest_0.6.25         mime_0.9              slam_0.1-47           R6_2.4.1              ggplot2_3.3.0        
[11] pillar_1.4.3          zlibbioc_1.30.0       gplots_3.0.3          rlang_0.4.5           rstudioapi_0.10      
[16] data.table_1.12.8     gdata_2.18.0          Matrix_1.2-17         DT_0.13               preprocessCore_1.46.0
[21] sets_1.0-18           shinyjs_1.1           BiocParallel_1.18.1   htmlwidgets_1.5.1     igraph_1.2.5         
[26] munsell_0.5.0         shiny_1.4.0.2         fgsea_1.10.1          compiler_3.6.1        httpuv_1.5.2         
[31] pkgconfig_2.0.3       marray_1.62.0         htmltools_0.4.0       tidyselect_0.2.5      tibble_3.0.0         
[36] gridExtra_2.3         fansi_0.4.1           crayon_1.3.4          dplyr_0.8.3           later_1.0.0          
[41] bitops_1.0-6          grid_3.6.1            jsonlite_1.6.1        xtable_1.8-4          gtable_0.3.0         
[46] lifecycle_0.2.0       magrittr_1.5          scales_1.1.0          KernSmooth_2.23-15    cli_2.0.2            
[51] affyio_1.54.0         promises_1.1.0        ellipsis_0.3.0        vctrs_0.2.4           fastmatch_1.1-0      
[56] tools_3.6.1           glue_1.3.2            purrr_0.3.3           yaml_2.2.0            fastmap_1.0.1        
[61] colorspace_1.4-1      BiocManager_1.30.10   cluster_2.1.0         caTools_1.18.0        shinydashboard_0.7.1

varemo commented 4 years ago

Interesting, I have not seen this before! I suspect it has to do with some p-values being NA. Could you check your GSAres object and see what $pDistinctDirUp and $pDistinctDirDn are (or $pAdjDistinctDirUp/Dn if using adjusted p-values)? Try setting the ones that are NA to 1. Does this remove the error?
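
For reference, the error and the accompanying warning can both be reproduced in base R with toy data. This is only a minimal sketch of the general mechanism, not piano's actual code: the names pValues and tmp are borrowed from the error message, and the 0.05 cutoff and the all-NA column are assumptions made for illustration.

```r
# Toy p-values; the second entry stands in for a gene set without a p-value.
pValues <- c(0.01, NA, 0.20)

# A logical index built from a comparison inherits the NA.
tmp <- pValues < 0.05

# Assigning more than one value through an index that contains NA fails.
res <- tryCatch({
  pValues[tmp] <- -pValues[tmp]
  "no error"
}, error = function(e) conditionMessage(e))
print(res)

# Taking min() with na.rm = TRUE over an all-NA column returns Inf and
# raises the "no non-missing arguments to min" warning (suppressed here).
allNA <- matrix(NA_real_, nrow = 2, ncol = 1)
colMin <- suppressWarnings(apply(allNA, 2, min, na.rm = TRUE))
print(colMin)
```

If something similar happens inside the plotting functions, any gene set whose p-value is NA would be enough to trigger the failure.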

janstrauss1 commented 4 years ago

Thanks for your quick response!

I had also suspected that some p-values were NA and checked my input data, but there don't seem to be any NAs in my input data tables.

Regarding my GSAres object, I exported the summary table generated with GSASummaryTable (my_GSAsummaryTable.xlsx). It seems that there are some NAs, including in $pDistinctDirUp and $pDistinctDirDn, for two GO terms.

It also shows up using the following code:

> sum(is.na(gsaRes$pDistinctDirUp))
[1] 2
> sum(is.na(gsaRes$pDistinctDirDn))
[1] 2
> sum(is.na(gsaRes$pAdjDistinctDirUp))
[1] 2
> sum(is.na(gsaRes$pAdjDistinctDirDn))
[1] 2

I then tried to set the NAs to 1 using:

is.na(gsaRes$pDistinctDirUp) <- 1
is.na(gsaRes$pAdjDistinctDirUp) <- 1
is.na(gsaRes$pDistinctDirDn) <- 1
is.na(gsaRes$pAdjDistinctDirDn) <- 1

Unfortunately, I still receive the same error when running:

nw <- networkPlot(gsaRes,
                  class = "distinct",
                  direction = "both",
                  significance=0.01,
                  overlap = 1,
                  lay = 4, # numerical between 1-5 to set the default layout
                  label = "names",
                  ncharLabel=Inf,
                  cexLabel=1.1
)

Strangely, when I check my GSASummaryTable again after trying to set the NAs to 1 with the code above, an additional NA has been introduced for the GO term in the first row:

> sum(is.na(gsaRes$pDistinctDirUp))
[1] 3
> sum(is.na(gsaRes$pDistinctDirDn))
[1] 3
> sum(is.na(gsaRes$pAdjDistinctDirUp))
[1] 3
> sum(is.na(gsaRes$pAdjDistinctDirDn))
[1] 3

I assume that is.na is not working properly on the GSAres object. Any ideas on how to get rid of the NAs?

Many thanks for your help! Jan

janstrauss1 commented 4 years ago

Just to add to my previous comment: it really seems that is.na(gsaRes$p...) <- 1 introduces an NA in the very first row instead of replacing NA with 1.
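
This matches how the replacement form of is.na is documented in base R: `is.na(x) <- i` marks the elements at index i as NA, so a value of 1 sets the first element to NA rather than turning existing NAs into 1. A small sketch with a made-up vector p (not taken from the thread) shows the difference from indexing with is.na():

```r
p <- c(0.03, NA, 0.40)

# `is.na(x) <- 1` sets element 1 to NA; the existing NA is left untouched.
broken <- p
is.na(broken) <- 1
print(sum(is.na(broken)))   # one more NA than before

# Indexing with is.na() on the left-hand side replaces the NAs themselves.
fixed <- p
fixed[is.na(fixed)] <- 1
print(sum(is.na(fixed)))
```

The same rule applies to a matrix, where index 1 is the element in the first row and first column.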

janstrauss1 commented 4 years ago

I seem to have found an (at least temporary) workaround for the error.

Replacing the NAs in my GSAres object with the following code seems to fix the error when using the networkPlot function with class = "distinct" and direction = "both":

> gsaRes$pDistinctDirUp[is.na(gsaRes$pDistinctDirUp)] <- 1
> sum(is.na(gsaRes$pDistinctDirUp))
[1] 0

Interestingly, networkPlot seems to work fine when I replace the NAs only in gsaRes$pDistinctDirUp with the code above, without replacing the NAs in gsaRes$pDistinctDirDn! In other words, calling the networkPlot function with class = "distinct" and direction = "both" after the following code works fine:

> gsaRes$pDistinctDirUp[is.na(gsaRes$pDistinctDirUp)] <- 1
> sum(is.na(gsaRes$pDistinctDirUp))
[1] 0
> sum(is.na(gsaRes$pDistinctDirDn))
[1] 2

However, the error still remains when calling the newer networkPlot2 function instead of networkPlot, even when I replace the NAs in both gsaRes$pDistinctDirUp and gsaRes$pDistinctDirDn with 1. Strangely, for the networkPlot2 function with class = "distinct" and direction = "both", the error disappears only when I replace NA with 1 in both the p-values and the adjusted p-values, using the following code:

> gsaRes$pDistinctDirUp[is.na(gsaRes$pDistinctDirUp)] <- 1
> sum(is.na(gsaRes$pDistinctDirUp))
[1] 0
> gsaRes$pDistinctDirDn[is.na(gsaRes$pDistinctDirDn)] <- 1
> sum(is.na(gsaRes$pDistinctDirDn))
[1] 0
> gsaRes$pAdjDistinctDirUp[is.na(gsaRes$pAdjDistinctDirUp)] <- 1
> sum(is.na(gsaRes$pAdjDistinctDirUp))
[1] 0
> gsaRes$pAdjDistinctDirDn[is.na(gsaRes$pAdjDistinctDirDn)] <- 1
> sum(is.na(gsaRes$pAdjDistinctDirDn))
[1] 0

Any ideas about this behaviour? Many thanks in advance, Jan
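
The four replacements above can also be written as one loop. The sketch below uses a toy list as a stand-in for the GSAres object (an assumption for illustration; a real object holds many more fields), and it treats a missing p-value as non-significant by setting it to 1, as suggested earlier in the thread.

```r
# Toy stand-in for a GSAres-like list, with one NA per p-value matrix.
gsaRes <- list(
  pDistinctDirUp    = matrix(c(0.01, NA, 0.5), ncol = 1),
  pDistinctDirDn    = matrix(c(0.99, NA, 0.5), ncol = 1),
  pAdjDistinctDirUp = matrix(c(0.03, NA, 0.6), ncol = 1),
  pAdjDistinctDirDn = matrix(c(0.99, NA, 0.6), ncol = 1)
)

fields <- c("pDistinctDirUp", "pDistinctDirDn",
            "pAdjDistinctDirUp", "pAdjDistinctDirDn")

for (f in fields) {
  m <- gsaRes[[f]]
  m[is.na(m)] <- 1   # replace missing p-values with 1 (non-significant)
  gsaRes[[f]] <- m
}

# Count the remaining NAs per field.
naCounts <- vapply(gsaRes[fields], function(m) sum(is.na(m)), integer(1))
print(naCounts)
```

Working on a copy of the results object keeps the original NAs available for checking which gene sets were affected.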