jinyancool closed 1 year ago
packageVersion("furrr")
[1] '0.3.0.9000'
packageVersion("future")
[1] '1.25.0'
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 8
Matrix products: default
BLAS/LAPACK: /cluster/apps/anaconda3/2020.02/envs/R-4.1.1/lib/libopenblasp-r0.3.17.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] furrr_0.3.0.9000 future_1.25.0 jhtools_1.0.0
[4] glue_1.6.2 jhuanglabwgs_1.0.0 optparse_1.7.1
[7] configr_0.3.5 futile.logger_1.4.3 pak_0.3.0
[10] devtools_2.4.3 usethis_2.1.5 rvcheck_0.2.1
[13] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9
[16] purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
[19] tibble_3.1.7 ggplot2_3.3.6 tidyverse_1.3.1
[22] fs_1.5.2 wget_0.0.1
loaded via a namespace (and not attached):
[1] utf8_1.2.2 tidyselect_1.1.2
[3] htmlwidgets_1.5.4 RSQLite_2.2.14
[5] AnnotationDbi_1.54.1 grid_4.1.1
[7] BiocParallel_1.28.3 munsell_0.5.0
[9] codetools_0.2-18 withr_2.5.0
[11] colorspace_2.0-3 Biobase_2.54.0
[13] filelock_1.0.2 ggfortify_0.4.14
[15] knitr_1.39 rstudioapi_0.13
[17] stats4_4.1.1 ggsignif_0.6.3
[19] listenv_0.8.0 MatrixGenerics_1.6.0
[21] tximport_1.20.0 GenomeInfoDbData_1.2.7
[23] ini_0.3.1 bit64_4.0.5
[25] rprojroot_2.0.3 parallelly_1.31.1
[27] vctrs_0.4.1 generics_0.1.2
[29] xfun_0.30 lambda.r_1.2.4
[31] biovizBase_1.40.0 BiocFileCache_2.2.1
[33] regioneR_1.24.0 R6_2.5.1
[35] GenomeInfoDb_1.30.1 AnnotationFilter_1.16.0
[37] bitops_1.0-7 cachem_1.0.6
[39] DelayedArray_0.20.0 assertthat_0.2.1
[41] BiocIO_1.2.0 scales_1.2.0
[43] nnet_7.3-17 gtable_0.3.0
[45] globals_0.15.0 processx_3.5.3
[47] ensembldb_2.16.4 rlang_1.0.2
[49] splines_4.1.1 lazyeval_0.2.2
[51] rtracklayer_1.52.1 rstatix_0.7.0
[53] dichromat_2.0-0.1 checkmate_2.1.0
[55] broom_0.8.0 BiocManager_1.30.17
[57] yaml_2.3.5 abind_1.4-5
[59] modelr_0.1.8 GenomicFeatures_1.44.2
[61] backports_1.4.1 Hmisc_4.7-0
[63] tools_4.1.1 ellipsis_0.3.2
[65] gplots_3.1.3 RColorBrewer_1.1-3
[67] karyoploteR_1.18.0 DNAcopy_1.66.0
[69] BiocGenerics_0.40.0 sessioninfo_1.2.2
[71] Rcpp_1.0.8.3 base64enc_0.1-3
[73] progress_1.2.2 zlibbioc_1.40.0
[75] RCurl_1.98-1.6 ps_1.7.0
[77] prettyunits_1.1.1 rpart_4.1.16
[79] ggpubr_0.4.0 RcppTOML_0.1.7
[81] S4Vectors_0.32.4 cluster_2.1.3
[83] SummarizedExperiment_1.24.0 haven_2.5.0
[85] magrittr_2.0.3 data.table_1.14.2
[87] futile.options_1.0.1 openxlsx_4.2.5
[89] reprex_2.0.1 ProtGenerics_1.24.0
[91] matrixStats_0.62.0 pkgload_1.2.4
[93] hms_1.1.1 patchwork_1.1.1
[95] XML_3.99-0.9 jpeg_0.1-9
[97] readxl_1.4.0 IRanges_2.28.0
[99] gridExtra_2.3 testthat_3.1.4
[101] compiler_4.1.1 biomaRt_2.48.3
[103] KernSmooth_2.23-20 crayon_1.5.1
[105] htmltools_0.5.2 tzdb_0.3.0
[107] Formula_1.2-4 lubridate_1.8.0
[109] DBI_1.1.2 formatR_1.12
[111] corrplot_0.92 dbplyr_2.1.1
[113] rappdirs_0.3.3 Matrix_1.4-1
[115] getopt_1.20.3 car_3.0-13
[117] brio_1.1.3 cli_3.3.0
[119] gdata_2.18.0 parallel_4.1.1
[121] GenomicRanges_1.46.1 pkgconfig_2.0.3
[123] GenomicAlignments_1.28.0 foreign_0.8-82
[125] xml2_1.3.3 XVector_0.34.0
[127] rvest_1.0.2 yulab.utils_0.0.4
[129] bezier_1.1.2 VariantAnnotation_1.38.0
[131] callr_3.7.0 digest_0.6.29
[133] Biostrings_2.60.2 cellranger_1.1.0
[135] htmlTable_2.4.0 restfulr_0.0.13
[137] curl_4.3.2 Rsamtools_2.8.0
[139] gtools_3.9.2 rjson_0.2.21
[141] lifecycle_1.0.1 jsonlite_1.8.0
[143] carData_3.0-5 desc_1.4.1
[145] limma_3.50.3 BSgenome_1.60.0
[147] fansi_1.0.3 pillar_1.7.0
[149] lattice_0.20-45 survival_3.3-1
[151] KEGGREST_1.32.0 fastmap_1.1.0
[153] httr_1.4.3 pkgbuild_1.3.1
[155] remotes_2.4.2 conflicted_1.1.0
[157] zip_2.2.0 bamsignals_1.24.0
[159] png_0.1-7 bit_4.0.4
[161] stringi_1.7.6 blob_1.2.3
[163] org.Hs.eg.db_3.13.0 latticeExtra_0.6-29
[165] caTools_1.18.2 memoise_2.0.1
https://cran.r-project.org/web/packages/future/vignettes/future-7-for-package-developers.html
The document at the above URL does not help.
It does not work as expected
You haven't explained what the actual problem is. Can you please provide some output for the failing case?
It does not fail; it just does not work as expected. When I call it as an R script function, it uses 60 workers in parallel. When I call it as the package function mypkg::mutect2(config, interval_dir), it uses only two workers and runs sequentially. I can reproduce this reliably. I have tried .env_globals = rlang::global_env() and .env_globals = parent.frame(); neither helps.
future_walk(intervals, ~ mutect2_wes_one(config, .x), .env_globals = rlang::global_env())
I just think it is the function calling environment that caused this problem.
So the problem is that it is running sequentially when called through the package, even though you set plan(multisession) in the package function? But if you don't put it in a package then it correctly runs in parallel?
That sounds strange to me.
It is unlikely to be a function environment issue if that is the case.
Can you point me to a repo on GitHub that has this package in it? Or can you create a repo on GitHub that demonstrates this problem for you? I am unlikely to be able to help you otherwise
By the way, setting plan() inside a function is typically not best practice. plan() should really only be called at the user level. Users should control whether or not the function runs in parallel, and the default should be to run sequentially.
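As a minimal sketch of that recommendation: the package-style function below never calls plan(), and the caller picks the backend. The name square_all and the choice of two workers are hypothetical, used only for illustration; this is not code from the package discussed in this thread.

```r
library(future)
library(furrr)

# Hypothetical package-style function: it does not call plan() itself,
# so it runs sequentially unless the user has chosen another plan.
square_all <- function(xs) {
  future_map_dbl(xs, ~ .x^2)
}

# The user decides how futures are resolved, then restores the old plan.
oplan <- plan(multisession, workers = 2)
res <- square_all(1:4)
plan(oplan)
```

With this layout the same function also works unchanged under plan(sequential), which is the default when no plan has been set.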
That is right. The problem is that it runs sequentially when called through the package, even though I set plan(multisession) in the package function. If I don't put it in a package, it correctly runs in parallel. I will try to upload the package to GitHub. It would be better if you have gatk installed.
If I put the plan() outside the R package, it still cannot run in parallel.
I have made an R package at:
https://github.com/jinyancool/fakepkg
The function is:
test_furrr <- function() {
  intervals <- seq(1, 60)
  oplan <- plan(multisession, workers = 60)
  on.exit(plan(oplan), add = TRUE)
  future_walk(intervals, ~ run_fun(.x))
}
You will find that running fakepkg::test_furrr() and pasting the test_furrr() definition into the terminal and running it directly give quite different timings.
library(tictoc)
tic()
test_furrr()
toc()
25.259 sec elapsed

tic()
fakepkg::test_furrr()
toc()
157.118 sec elapsed
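Besides wall-clock time, one possible way to check whether work is actually spread across workers is to collect the process ID seen by each task. The sketch below is an assumption-laden diagnostic, not part of fakepkg: count_worker_pids is a hypothetical helper, and two workers are used only to keep the example small.

```r
library(future)
library(furrr)

# Hypothetical helper: run n_tasks trivial tasks and count how many
# distinct R processes handled them.
count_worker_pids <- function(n_tasks) {
  pids <- future_map_int(seq_len(n_tasks), ~ Sys.getpid())
  length(unique(pids))
}

oplan <- plan(multisession, workers = 2)
n_workers <- nbrOfWorkers()     # workers the current plan provides
n_pids <- count_worker_pids(4)  # distinct processes actually used
plan(oplan)
```

If n_pids stays at 1 while n_workers is larger, the tasks were all handled by a single process, which would be consistent with sequential execution. Running the same check through the package function and through the pasted script function would show whether the two calls really differ in how many processes they use.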
Can we solve this issue now? Thanks.
Closing due to inability to reproduce
Dear developer, when I write an R script function that uses future_walk, it runs in parallel, but if I wrap the same function in an R package, it runs sequentially.
Running the script function, mutect2(config, interval_dir), is fine.
The same function placed in the R package mypkg and called from outside the package, e.g. mypkg::mutect2, behaves differently.
Running mypkg::mutect2(config, interval_dir) does not work as expected.