Open fabian-s opened 5 years ago
Thanks for reporting. Yes, it's likely that there's a race condition for updating the internal Rprofmem file. This file is unique per R session;
but it is shared across forked R processes as when using mclapply()
or plan(multicore)
.
It shouldn't be too hard to make it unique to each forked child process. I'll add it to the todo list for the next release.
Forgot to say, fixing this in the 'profmem' package will not fix the closely-related problem in bench::mark()
. The 'bench' package uses 'profmem' only for parsing an Rprofile file (profmem::readRprofmem()
) - it does not rely on the other parts of profmem.
Follow up/for the record: This problem can be reproduced as:
> p <- profmem::profmem(y <- parallel::mclapply(1:10, rnorm))
Error in parse(text = trace) : <text>:1:103: unexpected symbol
1: c("FUN", "lapply", "doTryCatch", "tryCatchOne", "tryCatchList", "tryCatch", "try", "sendMaster", "FUN"chList
^
> traceback()
8: parse(text = trace)
7: eval(parse(text = trace), enclos = baseenv())
6: FUN(X[[i]], ...)
5: lapply(bfr, FUN = function(x) {
bytes <- gsub(pattern, "\\1", x)
what <- rep("alloc", times = length(x))
idxs <- which(bytes == "new page")
if (length(idxs) > 0) {
what[idxs] <- "new page"
bytes[idxs] <- ""
}
bytes <- as.numeric(bytes)
trace <- gsub(pattern, "\\2", x)
trace <- gsub("\" \"", "\", \"", trace, fixed = TRUE)
trace <- sprintf("c(%s)", trace)
trace <- eval(parse(text = trace), enclos = baseenv())
trace <- trace[seq_len(max(0L, length(trace) - drop))]
list(what = what, bytes = bytes, trace = trace)
})
4: readRprofmem(pathname, drop = drop)
3: profmem_suspend()
2: profmem_begin(threshold = threshold)
1: profmem::profmem(y <- parallel::mclapply(1:10, rnorm))
Not surprisingly, it can also be reproduced with plain utils::Rprofmem()
:
> pf <- tempfile()
> local({ utils::Rprofmem(pf); on.exit(utils::Rprofmem("")); y <- parallel::mclapply(1:10, FUN = function(ii) integer(1e6)) })
> p <- profmem::readRprofmem(pf)
Error in parse(text = trace) : <text>:1:18: unexpected symbol
1: c("FUN", "l272 :"get0
^
> traceback()
5: parse(text = trace)
4: eval(parse(text = trace), enclos = baseenv())
3: FUN(X[[i]], ...)
2: lapply(bfr, FUN = function(x) {
bytes <- gsub(pattern, "\\1", x)
what <- rep("alloc", times = length(x))
idxs <- which(bytes == "new page")
if (length(idxs) > 0) {
what[idxs] <- "new page"
bytes[idxs] <- ""
}
bytes <- as.numeric(bytes)
trace <- gsub(pattern, "\\2", x)
trace <- gsub("\" \"", "\", \"", trace, fixed = TRUE)
trace <- sprintf("c(%s)", trace)
trace <- eval(parse(text = trace), enclos = baseenv())
trace <- trace[seq_len(max(0L, length(trace) - drop))]
list(what = what, bytes = bytes, trace = trace)
})
1: profmem::readRprofmem(pf)
As OP says, this is because forked parallel processing is used. This is because R's internal do_Rprofmem()
and more precisely R_InitMemReporting()
/R_EndMemReporting()
are not fork safe; all forked child processes will write to the exact same Rprofmem file as the parent process.
Actions: I'm not sure there is much that can be done in the profmem package. I don't think it's possible to protect against using Rprofmem()
in forked processing. Maybe one could give a more informative error message, e.g. "Error: ... Was memory profiling run with forked processing? If so, that is unfortunately not safe".
Here's an example of intertwined Rprofmem output from multiple forked processes:
192 :"FUN" "lapply" "<Anonymous>" "<Anonymous>"
328 :"FUN" "lapply" "<Anonymous>" "<Anonymous>"
192 :"FUN" "lapply" "<Anonynew page:"tolower" ".check_ncores" "<Anonymous>" "<Anonymous>"
new page:"seq.default" "seq" "FUN" "lapply" "<Anonymous>" "<Anonymous>"
new page:"seq.default" "seq" "FUN" "lapply" "<Anonymous>" "<Anonymous>"
Note that line with '... "<Anonynew page:"tolower" ...'. This gives:
Error in parse(text = trace) : <text>:1:37: unexpected symbol
1: c("FUN", "lapply", "<Anonynew page:"tolower
One option could be the replace non-parsable entries with a <non-parsable>
result and produce a warning that the captured and parsed Rprofmem results might not be correct. One could also give more control to the user on what should happen via an onParseError=c("error", "warning", "ignore")
argument to readRprofmem()
.
this is stochastic, the failing examples below sometimes work for smaller jobs with fewer calls.
fork
-ed computations, mainly, but not consistently (cf.future.lapply
, which never failed me so far)Enter a frame number, or 0 to exit
1: bench::mark(parallel::mclapply(seq_len(1000), keep_busy)) 2: eval_one(exprs[[i]]) 3: parse_allocations(f) 4: profmem::readRprofmem(filename) 5: lapply(bfr, FUN = function(x) { bytes <- gsub(pattern, "\1", x) wh 6: FUN(X[[i]], ...) 7: eval(parse(text = trace)) 8: parse(text = trace)
Created on 2019-10-06 by the reprex package (v0.3.0)
Session info
``` r devtools::session_info() #> ─ Session info ────────────────────────────────────────────────────────── #> setting value #> version R version 3.6.1 (2019-07-05) #> os Linux Mint 19.2 #> system x86_64, linux-gnu #> ui X11 #> language en_US #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Berlin #> date 2019-10-06 #> #> ─ Packages ────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1) #> backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.1) #> bench 1.0.4 2019-09-06 [1] CRAN (R 3.6.1) #> callr 3.3.1 2019-07-18 [1] CRAN (R 3.6.1) #> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.1) #> codetools 0.2-16 2018-12-24 [4] CRAN (R 3.5.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1) #> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1) #> devtools 2.1.0 2019-07-06 [1] CRAN (R 3.6.1) #> digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.1) #> doParallel * 1.0.15 2019-08-02 [1] CRAN (R 3.6.1) #> doSNOW * 1.0.18 2019-07-27 [1] CRAN (R 3.6.1) #> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1) #> fansi 0.4.0 2018-10-05 [1] CRAN (R 3.6.1) #> foreach * 1.4.7 2019-07-27 [1] CRAN (R 3.6.1) #> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1) #> future 1.14.0 2019-07-02 [1] CRAN (R 3.6.1) #> future.apply 1.3.0 2019-06-18 [1] CRAN (R 3.6.1) #> globals 0.12.4 2018-10-11 [1] CRAN (R 3.6.1) #> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1) #> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1) #> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.1) #> iterators * 1.0.12 2019-07-26 [1] CRAN (R 3.6.1) #> knitr 1.24 2019-08-08 [1] CRAN (R 3.6.1) #> listenv 0.7.0 2018-01-21 [1] CRAN (R 3.6.1) #> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1) #> pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.1) #> pkgbuild 1.0.4 2019-08-05 [1] CRAN (R 3.6.1) #> pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.1) #> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1) #> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.1) #> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.1) #> profmem 0.5.0 2018-01-30 [1] CRAN (R 3.6.1) #> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.1) #> R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.1) #> Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.6.1) #> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.1) #> rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.1) #> rmarkdown 1.15 2019-08-21 [1] CRAN (R 3.6.1) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1) #> snow * 0.4-3 2018-09-14 [1] CRAN (R 3.6.1) #> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.1) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.1) #> testthat 2.2.1 2019-07-25 [1] CRAN (R 3.6.1) #> tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.1) #> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1) #> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.1) #> vctrs 0.2.0 2019-07-05 [1] CRAN (R 3.6.1) #> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1) #> xfun 0.9 2019-08-21 [1] CRAN (R 3.6.1) #> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.1) #> zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.1) #> #> [1] /home/fabian-s/R/x86_64-pc-linux-gnu-library/3.6 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```