MichaelChirico / r-bugs

A ⚠️ read-only ⚠️ mirror of https://bugs.r-project.org/

[BUGZILLA #15363] memory used unexpectedly grows in recent versions of R (win 32) #4951

Closed MichaelChirico closed 4 years ago

MichaelChirico commented 4 years ago

Unlike in older versions of R, replacing part of a matrix makes memory usage grow quickly; replacing the whole matrix does not. Code to reproduce this behaviour:

NClu <- 378
nr <- 100000
ns <- 5e6
NRIP <- ceiling(ns/nr)
vett_p <- matrix(0, nrow=1, ncol=10)
print(memory.size())
for (j in 1:NRIP) {
  for (s in 1:nr) {
    for (i in 1:NClu) {
      vett_p[1,1:10] <- (1:10)/15
    }
  }
  cat(j, "of", NRIP, " "); print(Sys.time()); print(memory.size())
}

Under R 3.0.1 (and R 2.15.2, too) the above code shows steadily growing memory use, as reported by memory.size(), until it stops with an out-of-memory error. Here is a partial output from running the above code:

[1] 14.05
1 of 50  [1] "2013-06-21 13:21:22 CEST"
[1] 44.03
2 of 50  [1] "2013-06-21 13:26:46 CEST"
[1] 74.52
3 of 50  [1] "2013-06-21 13:33:12 CEST"
[1] 100.47
4 of 50  [1] "2013-06-21 13:40:45 CEST"
[1] 137.83
.....
17 of 50  [1] "2013-06-21 16:56:38 CEST"
[1] 541.65
18 of 50  [1] "2013-06-21 17:19:08 CEST"
[1] 541.65
Error: cannot allocate vector of size 631.4 Mb

At this point, the sum of object.size() over all objects is not large:

sum(sapply(ls(), function(ZZ) object.size(get(ZZ))))

[1] 464

Running gc() after the script terminates with the error does not reduce memory allocated:

gc()
            used  (Mb) gc trigger   (Mb)  max used  (Mb)
Ncells    184333   5.0     407500   10.9    407500  10.9
Vcells  69135669 527.5  133235439 1016.6 126911523 968.3

memory.size()

[1] 536.69

In the code above I do not see anything explicitly growing in size, or requiring a lot of RAM. Furthermore, replacing the statement vett_p[1,1:10] <- (1:10)/15 with vett_p <- (1:10)/15
does not increase memory usage.
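
A minimal side-by-side check of that observation (a sketch, not from the original runs; both loops are run at top level, i.e. in .GlobalEnv, and Vcells usage is read from gc()):

vett_p <- matrix(0, nrow=1, ncol=10)

before <- gc()["Vcells", "used"]
for (i in 1:1e6) vett_p[1,1:10] <- (1:10)/15   # subassignment form
gc()["Vcells", "used"] - before                # grows steadily on affected builds

before <- gc()["Vcells", "used"]
for (i in 1:1e6) vett_p <- (1:10)/15           # plain assignment form
gc()["Vcells", "used"] - before                # stays essentially flat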

Also, running the same code in R 2.6.2 does not fail:

[1] 11.05479
1 of 50  [1] "2013-06-21 13:21:37 W. Europe Daylight Time"
[1] 12.47408
2 of 50  [1] "2013-06-21 13:25:56 W. Europe Daylight Time"
[1] 12.49328
....
49 of 50  [1] "2013-06-21 16:51:06 W. Europe Daylight Time"
[1] 12.48106
50 of 50  [1] "2013-06-21 16:55:25 W. Europe Daylight Time"
[1] 12.47752

In all cases I used pre-built Windows binaries provided by CRAN, launched with: RGui.exe --vanilla

sessionInfo()

R version 3.0.1 (2013-05-16)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_3.0.1

# hardware/OS: Windows Server 2003 build 3790, Service Pack 2 (32-bit, machine has 8 cores, 16 GB RAM)
# also tested with R 3.0.1 on: Windows 7 Enterprise Service Pack 1 (32-bit, machine has 2 cores, 1.75 GB RAM)


METADATA

MichaelChirico commented 4 years ago

I can reproduce it on Linux as well (also in R-devel). The key is to run millions of iterations: the leak per iteration is minuscule, but it accumulates over time. Simplified example to reproduce:

vett_p <- matrix(0, nrow=1, ncol=10)
for (i in seq.int(1e7)) vett_p[1,1:10] <- 0
gc()

It seems like a Vcells leak or miscount: using DEBUG_GC, the node counts for all classes stay constant after each run of the loop, but the Vcells keep increasing:

[...]

for (i in seq.int(1e7)) vett_p[1,1:10] <- 0
gc()

Node counts after heap adjustment:
Class: 0, New = 12977, Old = 25908, OldToNew = 0, Total = 38885
Class: 1, New = 2423, Old = 4817, OldToNew = 0, Total = 7240
Class: 2, New = 954, Old = 1111, OldToNew = 0, Total = 2065
Class: 3, New = 486, Old = 1296, OldToNew = 0, Total = 1782
Class: 4, New = 349, Old = 179, OldToNew = 0, Total = 528
Class: 5, New = 141, Old = 183, OldToNew = 0, Total = 324
Class: 6, New = 156, Old = 306, OldToNew = 0, Total = 462
Class: 7, New = 0, Old = 184, OldToNew = 0, Total = 184
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells    33984  1.9     350000  18.7   350000  18.7
Vcells  4653604 35.6   14283348 109.0 13486130 102.9

for (i in seq.int(1e8)) vett_p[1,1:10] <- 0
gc()

Node counts after heap adjustment:
Class: 0, New = 12970, Old = 25915, OldToNew = 0, Total = 38885
Class: 1, New = 2423, Old = 4817, OldToNew = 0, Total = 7240
Class: 2, New = 952, Old = 1113, OldToNew = 0, Total = 2065
Class: 3, New = 477, Old = 1305, OldToNew = 0, Total = 1782
Class: 4, New = 348, Old = 180, OldToNew = 0, Total = 528
Class: 5, New = 141, Old = 183, OldToNew = 0, Total = 324
Class: 6, New = 155, Old = 307, OldToNew = 0, Total = 462
Class: 7, New = 0, Old = 184, OldToNew = 0, Total = 184
            used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells     34004   1.9     350000  18.7    34004   1.9
Vcells  27805224 212.2   91179667 695.7 27805224 212.2

It is unclear whether this is just a reporting issue -- it could be, but the process does seem to be using the memory:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
10296 urbanek   20   0  180m 115m 3716 R  100  0.0  0:09.97 R

All of the above is using R-devel r63070


METADATA

MichaelChirico commented 4 years ago

The change that caused this example to start running up the Vcells is

luke@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>-Latitude build% svn log -r 56222 ../R
------------------------------------------------------------------------
r56222 | luke | 2011-06-26 05:37:40 -0500 (Sun, 26 Jun 2011) | 2 lines

Made .GlobalEnv a hashed environment.

------------------------------------------------------------------------

So it looks like some issue with the hashed environment implementation. To confirm, in current R-devel (r63072)

vett_p <- matrix(0, nrow=1, ncol=10)
for (i in seq(1e7)) vett_p[1,1:10] <- 0
gc()

has the problem but

local({
    vett_p <- matrix(0, nrow=1, ncol=10)
    for (i in seq(1e7)) vett_p[1,1:10] <- 0
    gc()
}, envir = new.env(hash = FALSE))

does not.

I may not be able to look further for a few days, but I will when I get a chance, unless someone else gets there first.

[Repeated gc() calls also grow Vcells (by 12), even when .GlobalEnv is not hashed -- this may be the same issue with another hashed environment, or something else.]
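
A quick way to watch that (a sketch, not from the original test): record the Vcells "used" figure across repeated gc() calls and look at the increments.

used <- replicate(20, gc()["Vcells", "used"])
diff(used)   # a constant small increment (~12) per call would match the behaviour noted above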


METADATA

MichaelChirico commented 4 years ago

The problem is in "unbindvar" in envir.c. A comment says that the code for unbinding a variable in a hash table is currently unused, but this is not true. In particular, it is used to unbind the temporary variable created in applydefine in eval.c. Because this code does not decrement the count of how many hash slots are used, this count can grow, causing the hash table to be resized to larger and larger sizes.
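
For context on why the subassignment form (and not plain assignment) exercises that path: a complex assignment is evaluated roughly as the steps below, and the final removal of the `*tmp*` binding from the (hashed) .GlobalEnv is what goes through the unbind code on every iteration. The env.profile() line is only a hedged suggestion for watching the table grow, assuming it reports the current table size.

## vett_p[1,1:10] <- (1:10)/15  is evaluated roughly as:
`*tmp*` <- vett_p
vett_p <- `[<-`(`*tmp*`, 1, 1:10, value = (1:10)/15)
rm(`*tmp*`)   # this unbind hits the hash-table code that fails to decrement the slot count

utils::env.profile(globalenv())$size   # on affected builds this keeps growing as the table is resized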

The problem also exists in pqR-2013-06-20. I'll be releasing a new version of pqR that fixes it shortly.

There seems to be a similar problem in RemoveVariable, used by "rm".
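
One way to exercise that path (a sketch, not something run for this report): repeated assign()/rm() on a hashed environment should show the same kind of Vcells growth if RemoveVariable has the same miscount.

e <- new.env(hash = TRUE)
before <- gc()["Vcells", "used"]
for (i in seq_len(1e6)) {
    assign("x", 0, envir = e)   # fills one hash chain
    rm("x", envir = e)          # empties it again via RemoveVariable
}
gc()["Vcells", "used"] - before   # grows if the emptied-chain count is never decremented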


METADATA

MichaelChirico commented 4 years ago

This is now fixed in r63150 in the trunk and in r63151 in R-3-0-branch, by having R_HashDelete and RemoveVariable in envir.c decrement the number of chains in a hash table when removing a variable empties its chain. It would be better to rewrite the envir.c code to use macros that maintain this invariant, so this doesn't slip again.
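
A simple regression check along the lines of the original report (a sketch, not part of the commits): after the fix, the Vcells figure reported by gc() should stay essentially flat across the loop rather than growing with the iteration count.

vett_p <- matrix(0, nrow=1, ncol=10)
before <- gc()["Vcells", "used"]
for (i in seq_len(1e6)) vett_p[1,1:10] <- 0
gc()["Vcells", "used"] - before   # small and stable in fixed builds; grows in affected ones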


METADATA