ococrook / bandle

An R package for Bayesian analysis of differential localisation experiments
https://ococrook.github.io/bandle/

CPU issues? #18

Open JosieAC opened 2 years ago

JosieAC commented 2 years ago

https://github.com/ococrook/bandle/blob/0fa56ad941d0934194ff6246288c2719de3d45a7/R/bandle-function.R#L91

Hi Olly,

I'm not sure what is going on, but I have tried to explain the issue as best I can. It looks like a CPU issue, possibly with the number of cores assigned when the default BPPARAM is used. On my previous laptop the bandle function would complete a run in ~40 mins and appeared to run fine with the default. When running bandle with the same parameters on my new laptop, which has essentially the same or better specs, the code takes significantly longer (hours) and/or crashes. This seems to be resolved when the number of cores to use is explicitly specified in the function call, as follows:

bandleres_12hr <- bandle(objectCond1 = control,
                    objectCond2 = treatment,
                    numIter = 10000, # usually 10,000
                    burnin = 5000, # usually 5,000
                    thin = 20, # usually 20
                    gpParams = gpParams,
                    pcPrior = pc_prior,
                    numChains = 4, # usually >=4
                    dirPrior = dirPrior,
                    BPPARAM = MulticoreParam(6L))

Without specifying the BPPARAM parameter, no progress bar shows and the CPU works at 100% capacity. It does appear to be using all cores, but these are all "maxed out", which causes R to crash or freeze. If I persevered it might eventually run okay, but on both my current laptop and a friend's laptop that I borrowed, this usually ended with the session crashing or freezing. When BPPARAM is specified as above, the progress bar appears and the CPU works at only 50% capacity; it uses all cores, but these aren't all "maxed out".
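One way to narrow this down is to compare the number of cores the machine reports with what the default backend resolves to. The snippet below is only a diagnostic sketch, not part of the original report: it assumes BiocParallel is installed (it is skipped otherwise), and the variable names are illustrative. It may also be worth checking how many workers the explicit backend actually receives, since as far as I know MulticoreParam is documented as not supported on Windows.

```r
# Diagnostic sketch: compare the machine's cores with the default backend.
# 'parallel' ships with base R; BiocParallel is only queried if installed.
physical_cores <- parallel::detectCores(logical = FALSE)
logical_cores <- parallel::detectCores(logical = TRUE)
cat("Physical cores:", physical_cores, "\n")
cat("Logical cores: ", logical_cores, "\n")

if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # bpparam() returns the backend that is used when BPPARAM is not supplied
  default_backend <- BiocParallel::bpparam()
  cat("Default backend class:", class(default_backend)[1], "\n")
  cat("Default workers:      ", BiocParallel::bpnworkers(default_backend), "\n")
}
```

If the default worker count equals the number of logical cores, every core being "maxed out" would be consistent with what is described above.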

The bpparam and sessionInfo output for my old laptop is:

> bpparam()
class: SnowParam
  bpisup: FALSE; bpnworkers: 6; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
  bpexportglobals: TRUE
  bplogdir: NA
  bpresultdir: NA
  cluster type: SOCK

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
 [1] bandle_1.0           pRoloc_1.32.0        BiocParallel_1.26.0  MLInterfaces_1.72.0  cluster_2.1.2        annotate_1.70.0      XML_3.99-0.6        
 [8] AnnotationDbi_1.54.0 IRanges_2.26.0       MSnbase_2.18.0       ProtGenerics_1.24.0  mzR_2.26.0           Rcpp_1.0.6           Biobase_2.52.0      
[15] S4Vectors_0.30.0     BiocGenerics_0.38.0

loaded via a namespace (and not attached):
  [1] BiocFileCache_2.1.1    plyr_1.8.6             splines_4.1.0          GenomeInfoDb_1.29.3    ggplot2_3.3.3          digest_0.6.27        
  [7] foreach_1.5.1          htmltools_0.5.1.1      viridis_0.6.1          fansi_0.5.0            magrittr_2.0.1         memoise_2.0.0        
 [13] doParallel_1.0.16      mixtools_1.2.0         limma_3.48.0           recipes_0.1.16         Biostrings_2.60.0      gower_0.2.2          
 [19] lpSolve_5.6.15         prettyunits_1.1.1      colorspace_2.0-1       blob_1.2.2             rappdirs_0.3.3         xfun_0.23            
 [25] dplyr_1.0.6            crayon_1.4.1           RCurl_1.98-1.3         hexbin_1.28.2          impute_1.66.0          survival_3.2-11      
 [31] iterators_1.0.13       glue_1.4.2             gtable_0.3.0           ipred_0.9-11           zlibbioc_1.38.0        XVector_0.32.0        
 [37] kernlab_0.9-29         scales_1.1.1           vsn_3.60.0             mvtnorm_1.1-1          DBI_1.1.1              viridisLite_0.4.0    
 [43] xtable_1.8-4           progress_1.2.2         clue_0.3-59            bit_4.0.4              proxy_0.4-25           mclust_5.4.7          
 [49] preprocessCore_1.54.0  lbfgs_1.2.1            MsCoreUtils_1.4.0      lava_1.6.9             prodlim_2019.11.13     sampling_2.9          
 [55] httr_1.4.2             FNN_1.1.3              RColorBrewer_1.1-2     ellipsis_0.3.2         pkgconfig_2.0.3        nnet_7.3-16          
 [61] dbplyr_2.1.1           utf8_1.2.1             caret_6.0-88           tidyselect_1.1.1       rlang_0.4.11           reshape2_1.4.4        
 [67] munsell_0.5.0          tools_4.1.0            LaplacesDemon_16.1.6   cachem_1.0.5           generics_0.1.0         RSQLite_2.2.7        
 [73] evaluate_0.14          stringr_1.4.0          fastmap_1.1.0          mzID_1.31.0            yaml_2.2.1             ModelMetrics_1.2.2.2  
 [79] knitr_1.33             bit64_4.0.5            purrr_0.3.4            randomForest_4.6-14    KEGGREST_1.33.0        dendextend_1.15.1    
 [85] ncdf4_1.17             nlme_3.1-152           xml2_1.3.2             biomaRt_2.49.2         compiler_4.1.0         rstudioapi_0.13      
 [91] filelock_1.0.2         curl_4.3.1             png_0.1-7              e1071_1.7-8            affyio_1.62.0          tibble_3.1.2          
 [97] stringi_1.6.2          lattice_0.20-44        Matrix_1.3-3           vctrs_0.3.8            pillar_1.6.2           lifecycle_1.0.0      
[103] BiocManager_1.30.16    MALDIquant_1.19.3      data.table_1.14.0      bitops_1.0-7           R6_2.5.0               pcaMethods_1.84.0    
[109] affy_1.70.0            gridExtra_2.3          codetools_0.2-18       MASS_7.3-54            gtools_3.8.2           assertthat_0.2.1      
[115] rprojroot_2.0.2        withr_2.4.2            GenomeInfoDbData_1.2.6 hms_1.1.0              grid_4.1.0             rpart_4.1-15          
[121] timeDate_3043.102      coda_0.19-4            class_7.3-19           rmarkdown_2.9          segmented_1.3-4        pROC_1.17.0.1        
[127] lubridate_1.7.10  

For my new laptop:


> bpparam()
class: SnowParam
  bpisup: FALSE; bpnworkers: 6; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
  bpexportglobals: TRUE; bpforceGC: FALSE
  bplogdir: NA
  bpresultdir: NA
  cluster type: SOCK

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bandle_1.0           pRoloc_1.34.0        BiocParallel_1.28.1  MLInterfaces_1.74.0  cluster_2.1.2        annotate_1.72.0      XML_3.99-0.8        
 [8] AnnotationDbi_1.56.2 IRanges_2.28.0       MSnbase_2.20.1       ProtGenerics_1.26.0  mzR_2.28.0           Rcpp_1.0.7           Biobase_2.54.0      
[15] S4Vectors_0.32.3     BiocGenerics_0.40.0 

loaded via a namespace (and not attached):
  [1] snow_0.4-4             circlize_0.4.13        BiocFileCache_2.2.0    plyr_1.8.6             splines_4.1.2          listenv_0.8.0          GenomeInfoDb_1.30.0   
  [8] ggplot2_3.3.5          digest_0.6.28          foreach_1.5.1          htmltools_0.5.2        viridis_0.6.2          ggalluvial_0.12.3      fansi_0.5.0           
 [15] magrittr_2.0.1         memoise_2.0.0          doParallel_1.0.16      mixtools_1.2.0         limma_3.50.0           recipes_0.1.17         globals_0.14.0        
 [22] Biostrings_2.62.0      gower_0.2.2            lpSolve_5.6.15         prettyunits_1.1.1      colorspace_2.0-2       ggrepel_0.9.1          blob_1.2.2            
 [29] rappdirs_0.3.3         xfun_0.28              dplyr_1.0.7            crayon_1.4.2           RCurl_1.98-1.5         hexbin_1.28.2          impute_1.68.0         
 [36] survival_3.2-13        iterators_1.0.13       glue_1.5.0             gtable_0.3.0           ipred_0.9-12           zlibbioc_1.40.0        XVector_0.34.0        
 [43] kernlab_0.9-29         shape_1.4.6            future.apply_1.8.1     scales_1.1.1           vsn_3.62.0             mvtnorm_1.1-3          DBI_1.1.1             
 [50] viridisLite_0.4.0      xtable_1.8-4           progress_1.2.2         clue_0.3-60            bit_4.0.4              proxy_0.4-26           mclust_5.4.8          
 [57] preprocessCore_1.56.0  lbfgs_1.2.1            MsCoreUtils_1.6.0      lava_1.6.10            prodlim_2019.11.13     sampling_2.9           httr_1.4.2            
 [64] FNN_1.1.3              RColorBrewer_1.1-2     ellipsis_0.3.2         pkgconfig_2.0.3        nnet_7.3-16            dbplyr_2.1.1           utf8_1.2.2            
 [71] caret_6.0-90           tidyselect_1.1.1       rlang_0.4.12           reshape2_1.4.4         munsell_0.5.0          tools_4.1.2            LaplacesDemon_16.1.6  
 [78] cachem_1.0.6           generics_0.1.1         RSQLite_2.2.8          evaluate_0.14          stringr_1.4.0          fastmap_1.1.0          mzID_1.32.0           
 [85] yaml_2.2.1             ModelMetrics_1.2.2.2   knitr_1.36             bit64_4.0.5            randomForest_4.6-14    purrr_0.3.4            KEGGREST_1.34.0       
 [92] dendextend_1.15.2      ncdf4_1.17.1           future_1.23.0          nlme_3.1-153           xml2_1.3.2             biomaRt_2.50.1         compiler_4.1.2        
 [99] rstudioapi_0.13        filelock_1.0.2         curl_4.3.2             png_0.1-7              e1071_1.7-9            affyio_1.64.0          tibble_3.1.6          
[106] stringi_1.7.5          lattice_0.20-45        Matrix_1.3-4           vctrs_0.3.8            pillar_1.6.4           lifecycle_1.0.1        BiocManager_1.30.16   
[113] GlobalOptions_0.1.2    MALDIquant_1.20        data.table_1.14.2      bitops_1.0-7           R6_2.5.1               pcaMethods_1.86.0      affy_1.72.0           
[120] gridExtra_2.3          parallelly_1.29.0      codetools_0.2-18       gtools_3.9.2           MASS_7.3-54            assertthat_0.2.1       rprojroot_2.0.2       
[127] withr_2.4.2            GenomeInfoDbData_1.2.7 parallel_4.1.2         hms_1.1.1              grid_4.1.2             rpart_4.1-15           timeDate_3043.102     
[134] tidyr_1.1.4            coda_0.19-4            class_7.3-19           rmarkdown_2.11         segmented_1.3-4        pROC_1.18.0            lubridate_1.8.0

Let me know if you have any questions or need more info.

ococrook commented 2 years ago

Just to clarify: is it fixed if you specify the backend explicitly?

JosieAC commented 2 years ago

Technically yes, but it is still far slower than my previous laptop.

ococrook commented 2 years ago

Can you replace MulticoreParam(6L) with SnowParam(workers = 4L)?
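For reference, a minimal sketch of building such a backend with an explicit worker count. `pick_workers` is a hypothetical helper introduced here for illustration (it is not part of bandle or BiocParallel), and the choice to leave one core free is an assumption rather than a recommendation from the package authors. `progressbar = TRUE` asks BiocParallel to display its own progress bar.

```r
# Hypothetical helper: leave `reserve` cores free for the OS and the R session.
pick_workers <- function(reserve = 1L, available = parallel::detectCores()) {
  if (is.na(available)) available <- 1L  # detectCores() can return NA
  as.integer(max(1L, available - reserve))
}

n_workers <- pick_workers()

if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # SnowParam uses socket workers, the kind of backend available on Windows
  bp <- BiocParallel::SnowParam(workers = n_workers, progressbar = TRUE)
  print(bp)
  # bp could then be passed to bandle() as BPPARAM = bp
}
```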

JosieAC commented 2 years ago

This causes the same outcome as the default. No progress bar and "maxed out" CPU.

ococrook commented 2 years ago

Having had a dig around, this seems to be a BiocParallel issue rather than a bandle one. @lmsimp I'll do some more digging, but it would help if we could check some of this locally. I'll keep the issue open for now.

One thing to check is that you have the most up-to-date versions of RStudio and Rtools.
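A quick sketch for collecting those versions from within R. It assumes nothing beyond base R; the package and RStudio checks are skipped when the relevant package is missing or the session is not running inside RStudio. Looking for `make` on the PATH is only a rough proxy for a working Rtools installation.

```r
cat(R.version.string, "\n")

# When Rtools is on the PATH, build tools such as 'make' should be found;
# an empty string here means 'make' was not found.
cat("make found at:", Sys.which("make"), "\n")

for (pkg in c("BiocParallel", "bandle")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    cat(pkg, as.character(utils::packageVersion(pkg)), "\n")
  } else {
    cat(pkg, "is not installed\n")
  }
}

# rstudioapi::versionInfo() only works inside an RStudio session
if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
  cat("RStudio", as.character(rstudioapi::versionInfo()$version), "\n")
}
```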