kaskr / adcomp

AD computation with Template Model Builder (TMB)

memory issue with multithreading #316

Closed btamasi closed 4 years ago

btamasi commented 4 years ago

I'm repeatedly reassigning a TMB object in a loop, and I noticed that the memory held by the old objects is never released. The example below uses linreg_parallel.cpp from the TMB examples:

library("TMB")
compile("linreg_parallel.cpp")
dyn.load(dynlib("linreg_parallel"))

set.seed(123)
x <- seq(0, 10, length=50001)
data <- list(Y=rnorm(length(x)) + x, x=x)
parameters <- list(a=0, b=0, logSigma=0)

for (i in 1:1000) {
  obj <- MakeADFun(data, parameters, DLL="linreg_parallel")
}

If I run the for loop long enough, it gradually fills up my memory until the process gets killed. When I set openmp(1) I don't experience the problem.

The memory is only released when I terminate the R session.
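To put numbers on the growth, here is a minimal sketch that tracks the resident set size of the R process instead of relying on gc() output, which only reports memory managed by R itself. It assumes Linux (it reads /proc/self/status); rss_kb and rss_growth_kb are hypothetical helper names, not part of TMB.

```r
# Resident set size of the current R process in kB.
# Assumption: Linux, where /proc/self/status has a "VmRSS:" line.
rss_kb <- function() {
  status <- readLines("/proc/self/status")
  line <- grep("^VmRSS:", status, value = TRUE)
  as.numeric(gsub("[^0-9]", "", line))
}

# Call `f` `n` times (discarding its result) and return the RSS growth in kB.
rss_growth_kb <- function(f, n = 20) {
  before <- rss_kb()
  for (i in seq_len(n)) f()
  invisible(gc())  # let R collect whatever it considers unreachable first
  rss_kb() - before
}

# Usage with the objects defined above (not run here):
# rss_growth_kb(function() MakeADFun(data, parameters, DLL="linreg_parallel"))
```

Comparing the result under openmp(1) and openmp(4) should show whether the growth depends on the thread count.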

Versions and OS info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS
TMB_1.7.16

kaskr commented 4 years ago

@btamasi Thanks for the report.

The problem seems to depend on the version of R you are using: I could replicate it on a server running R-3.6.2 but not on another machine running R-3.4.4 (Example 1 below).

It seems the number of threads set by TMB::openmp() somehow interferes with the garbage collector (Example 2 below).

Example 1

From the shell, set a virtual memory limit of about 1 GB (ulimit -v takes the value in kilobytes):

ulimit -v 1000000

Then start R from that same shell, so that the limit applies, and run:

library(TMB)
openmp(4)
runExample("linreg_parallel")
for (i in 1:100) obj <- MakeADFun(data, parameters, DLL="linreg_parallel")

Example 2

Modify Example 1 to call gc() before each MakeADFun call:

loop <- function() {
  for (i in 1:100) {
    gc()
    obj <- MakeADFun(data, parameters, DLL="linreg_parallel")
  }
}

Then try

openmp(1); loop()  ## Works (frequent Free parallelADFun)
openmp(2); loop()  ## Works (frequent Free parallelADFun)
openmp(4); loop()  ## Fails (less frequent Free parallelADFun)
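
For context, the "Free parallelADFun" message presumably comes from a finalizer attached to the object's C++ pointer (an assumption about TMB internals, not confirmed in this thread). The base-R sketch below shows the general mechanism: a finalizer runs only once the garbage collector actually reclaims an unreachable object, and R decides when to collect based on its own heap, not on memory allocated outside it. The names counter, on_free and make_tracked are hypothetical.

```r
# Count how many tracked objects have been finalized.
counter <- new.env()
counter$freed <- 0L

# Finalizer: called by R when a tracked object is garbage collected.
on_free <- function(x) counter$freed <- counter$freed + 1L

# Create a small object with the finalizer attached.
make_tracked <- function() {
  e <- new.env()
  reg.finalizer(e, on_free)
  e
}

# Each reassignment makes the previous object unreachable, but nothing
# is finalized until a collection actually happens.
for (i in 1:5) obj <- make_tracked()
invisible(gc())  # finalizers of unreachable objects run as part of this call
counter$freed    # number of objects finalized so far
```

The object still bound to obj is not finalized, because it is still reachable.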

btamasi commented 4 years ago

For me (on R-4.0.2), explicitly calling the garbage collector doesn't help. I don't see any "Free parallelADFun object." messages when running your Example 2 with openmp(2); loop().

kaskr commented 4 years ago

A new function FreeADFun has been added to handle situations where gc() doesn't clean up as frequently as expected. I tested that the following snippet runs in R-4.0.2 without accumulating memory:

library(TMB)
openmp(4)
runExample("linreg_parallel")
for (i in 1:1000) {
  FreeADFun(obj)  # explicitly free the C++ side of the previous object
  gc()
  obj <- MakeADFun(data, parameters, DLL="linreg_parallel")
}
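
The same free-before-rebuild pattern can be wrapped in a small helper so that the loop does not depend on an obj already existing before the first iteration. This is only a sketch: rebuild is a hypothetical name, and the constructor and the free function are passed in as arguments, so the helper itself does not depend on TMB.

```r
# Hypothetical helper: build an object `n` times, explicitly freeing the
# previous one before each rebuild instead of waiting for the garbage collector.
rebuild <- function(n, make, free) {
  obj <- NULL
  for (i in seq_len(n)) {
    if (!is.null(obj)) free(obj)  # release the old object's resources first
    obj <- make()
  }
  obj  # only the last object built is returned and still alive
}

# Usage with TMB (not run here), assuming `data` and `parameters` exist
# and the linreg_parallel DLL is loaded:
# obj <- rebuild(1000,
#                function() MakeADFun(data, parameters, DLL="linreg_parallel"),
#                FreeADFun)
```

Passing FreeADFun as the free argument gives the same behavior as the loop above, minus the explicit gc() call.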