kaneplusplus / bigmemory


Memory blow-up with multiple calls to attach.resource() #94

Closed nbenn closed 5 years ago

nbenn commented 5 years ago

I'm encountering surprising behavior: multiple calls to attach.resource() inflate the amount of memory reported by top as VIRT. The following example was run under R 3.5.1, bigmemory 4.5.33 and CentOS 7:

memuse::Sys.procmem()[["size"]]
#> 292.039 MiB
a <- matrix(rnorm(1e8), ncol = 10L)
memuse::mu(a)
#> 762.940 MiB
b <- bigmemory::as.big.matrix(a, shared = TRUE)
c <- bigmemory::describe(b)
d <- replicate(100L, bigmemory::attach.resource(c))
memuse::Sys.procmem()[["size"]]
#> 76.290 GiB

Is this expected behavior? The reason I'm asking is that I wanted to create many sub.big.matrixes corresponding to a large shared big.matrix and the reported memory usage quickly escalated to TBs.
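The reported total is roughly what one would expect if each attach.resource() call added another mapping of the full matrix to the process's virtual address space. A back-of-the-envelope check in base R follows. It assumes the total is the baseline plus one mapping each for a and b plus the 100 attachments, which is an inference from the numbers above and not something taken from the bigmemory documentation:

```r
# Size of one 1e8-element double matrix in MiB (8 bytes per double).
mib_per_matrix <- 1e8 * 8 / 1024^2

# Baseline process size reported before any allocation (from the output above).
baseline_mib <- 292.039

# Assumption: a, b and each of the 100 attachments count once towards VIRT.
n_mappings <- 1 + 1 + 100

# Estimated virtual size in GiB.
estimate_gib <- (baseline_mib + n_mappings * mib_per_matrix) / 1024

print(round(mib_per_matrix, 3))
print(round(estimate_gib, 2))
```

Under that assumption the estimate lands within a few MiB of the 76.290 GiB reported above, which suggests that address space is being counted, not that 100 separate copies of the data exist.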

privefl commented 5 years ago

Maybe just an artifact of {memuse}? Do you actually run into memory problems (e.g. freezes) when you do that?

You could try using filebacked big.matrix objects instead.

nbenn commented 5 years ago

I see the same behavior for a file backed big.matrix object as for shared memory.

b <- bigmemory::as.big.matrix(a)

What do you mean by "artifact"? When I inspect the process with top, I see the same inflated number under the VIRT column. The RES column (arguably a better indicator for actual memory usage) reports a more reasonable number.

I'm running this on a shared resource and using 3.6 TiB of (virtual) memory for a small job (12 GiB matrix size) makes me feel a bit uneasy. Furthermore it does not feel like this would scale to a larger job (~100 GiB matrix size).
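To compare the two columns from inside the R session instead of switching to top, one option on Linux is to read VmSize (virtual) and VmRSS (resident) from /proc/self/status. The helper name read_proc_mem below is hypothetical, and the sketch simply returns NULL on systems without /proc:

```r
# Linux-only sketch: report virtual (VmSize) and resident (VmRSS) memory
# of the current R process in MiB. read_proc_mem is a hypothetical helper.
read_proc_mem <- function(path = "/proc/self/status") {
  if (!file.exists(path)) return(NULL)  # e.g. macOS or Windows
  lines <- readLines(path)
  get_kib <- function(key) {
    hit <- grep(paste0("^", key, ":"), lines, value = TRUE)
    if (length(hit) == 0L) return(NA_real_)
    # Lines look like "VmSize:   123456 kB"; keep only the digits.
    as.numeric(gsub("[^0-9]", "", hit[1L]))
  }
  c(virt_mib = get_kib("VmSize") / 1024,
    res_mib  = get_kib("VmRSS") / 1024)
}

mem <- read_proc_mem()
print(mem)
```

Calling read_proc_mem() before and after the replicate() line should show whether only virt_mib grows while res_mib stays close to the size of the data actually touched.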

privefl commented 5 years ago

I can run your code on my laptop. There is no way I have that much memory.

By artifact, I mean the same thing happens with object.size():

> object.size(a)
800000216 bytes
> object.size(list(a, a, a, a, a))
4000001176 bytes

However, no copy of a is actually made when creating the list; object.size() is simply counting the same object in memory several times.
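The same effect can be reproduced at a smaller scale. The sketch below uses only base R and a vector of 1e6 doubles (a size chosen here for illustration). It relies on the documented behavior that object.size() does not detect when elements of a list are shared:

```r
# A vector of 1e6 doubles, roughly 8 MB.
x <- numeric(1e6)

# A list holding five references to the same vector.
l <- list(x, x, x, x, x)

size_x <- as.numeric(object.size(x))
size_l <- as.numeric(object.size(l))

# object.size() sums the size of every element, so the ratio is close to 5
# even though the list elements all refer to x.
ratio <- size_l / size_x
print(ratio)
```

The reported size of the list is about five times that of x, which mirrors the list(a, a, a, a, a) result above.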

nbenn commented 5 years ago

@privefl I think this can be closed again. I opened the issue because the memory usage values reported by memuse::Sys.procmem() currently do not mean the same thing under Linux and macOS (see shinra-dev/memuse#8). So in a sense you were right: this could be considered an 'artefact' of memuse.