ANTsX / ANTsR

R interface to the ANTs biomedical image processing library
https://antsx.github.io/ANTsR
Apache License 2.0

Memory usage increase in repeated antsImageRead/Write #306

Closed muratmaga closed 4 years ago

muratmaga commented 4 years ago

I have a very large number of files (~23K) whose image spacing I would like to reset. If I run this piece of code, memory consumption increases linearly and quickly crashes the system.

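for (i in seq_along(f)) {
  temp = antsImageRead(f[i])    # read the next image
  antsSetSpacing(temp, c(1,1))  # reset the spacing to 1 x 1
  # cast to unsigned char and write to the filename stored on the image
  antsImageWrite(antsImageClone(temp, 'unsigned char'), filename = temp@filename)
  print(i)
}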

If I insert

remove(temp)
gc()

after the antsImageWrite, I no longer see the memory increase, but gc() slows things down quite a bit. I thought temp would be overwritten on each iteration, so it wouldn't cause a memory increase. Is it actually antsImageClone() that is inflating the memory usage?
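One possible middle ground between never calling gc() and calling it on every iteration is to collect only every N iterations. The sketch below assumes, as observed above, that the growth is reclaimable by gc(). The helper name `run_with_periodic_gc` and its `gc_every` parameter are hypothetical, not part of ANTsR.

```r
# Hypothetical helper (not part of ANTsR): apply `fn` to each element of
# `items`, forcing a garbage collection only every `gc_every` iterations.
# Doing the work inside `fn` means its temporaries go out of scope when the
# call returns, so they are eligible for collection at the next gc().
run_with_periodic_gc <- function(items, fn, gc_every = 100) {
  n_gc <- 0
  for (i in seq_along(items)) {
    fn(items[[i]])
    if (i %% gc_every == 0) {
      gc(verbose = FALSE)
      n_gc <- n_gc + 1
    }
  }
  invisible(n_gc)  # number of forced collections
}

# Usage with ANTsR (commented out so the sketch runs without it):
# reset_spacing <- function(path) {
#   img <- antsImageRead(path)
#   antsSetSpacing(img, c(1, 1))
#   antsImageWrite(antsImageClone(img, "unsigned char"), filename = path)
# }
# run_with_periodic_gc(f, reset_spacing, gc_every = 200)
```

The right value of `gc_every` depends on image size and available memory: a larger value means fewer collections but a higher peak between them.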

gdevenyi commented 4 years ago

As far as I understand R, this is as expected based on how the memory allocator works, confirmed by you adding the extra code to force a cleanup early.

This problem is really screaming for a bash script instead: it'll be much faster, you'll avoid this memory issue entirely, and you can take advantage of GNU parallel.

dorianps commented 4 years ago

If I remember correctly, I reported this issue here a few years ago, and we found it was happening on all platforms. Anyway, this is the reason why I tend to add gc calls in my loops. You can search previous memory issues to dig out what was said back then.

dorianps commented 4 years ago

Errata corrige: I reported it, and it was happening only on some platforms, not all.

stnava commented 4 years ago

as discussed in the other thread, the automated memory management in R has mysterious behavior. in brief, though, if it can be cleaned up by gc, it's not a memory leak.

we had a problem like this in antspy that was reproducible across all platforms and garbage collection couldn't touch it. a real memory leak.

that being said, if it were possible to isolate the effect to a single part of the call (e.g. the clone), that would help identify implementation issues.
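A rough way to attempt that isolation is to repeat one step at a time and watch the resident memory of the R process, since R's own gc() statistics would not include memory held by C++ code. This is only a sketch: `rss_kb` and `rss_growth` are hypothetical helper names, and the approach assumes Linux (`/proc/self/status`) or a system with a `ps` command, such as macOS.

```r
# Hypothetical helper: resident set size of the current R process in kB.
# Assumes Linux (/proc) or a system where `ps -o rss= -p PID` works.
rss_kb <- function() {
  status <- "/proc/self/status"
  if (file.exists(status)) {
    line <- grep("^VmRSS:", readLines(status), value = TRUE)
    return(as.numeric(gsub("[^0-9]", "", line)))
  }
  as.numeric(system2("ps", c("-o", "rss=", "-p", Sys.getpid()), stdout = TRUE))
}

# Hypothetical helper: run `step` n times without an explicit gc() and
# report how much the process RSS changed, in kB.
rss_growth <- function(step, n = 50) {
  before <- rss_kb()
  for (i in seq_len(n)) step()
  rss_kb() - before
}

# Usage with ANTsR (commented out so the sketch runs without it):
# img <- antsImageRead(f[1])
# rss_growth(function() antsImageRead(f[1]))                    # read only
# rss_growth(function() antsImageClone(img, "unsigned char"))   # clone only
```

RSS is a noisy signal, because the allocator may not return freed memory to the operating system, so only a clear and repeatable difference between steps would be meaningful.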

cookpa commented 4 years ago

I think it was consistent on all platforms. I had originally thought Mac was not affected, but I think that was a mistake.

As explained in this thread (see post by Jon Clayden), I think the issue is that R does not know about memory allocated by C++ code. It sees the R object for an antsImage, but as far as R is concerned that is a small object, and low priority for GC. Only when a GC is invoked by some other means does that object get cleaned up.

https://github.com/ANTsX/ANTsR/issues/111
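The deferred cleanup described above can be illustrated in base R alone. The sketch assumes that the C++ image memory is released by a finalizer attached to the R-side handle, which is how external-pointer wrappers commonly work but is not confirmed here for ANTsR. An environment stands in for the image handle, and `make_handle` and `mark_finalized` are hypothetical names.

```r
# Flag flipped by the finalizer, standing in for "C++ memory was freed".
finalized <- FALSE
mark_finalized <- function(e) finalized <<- TRUE

# Hypothetical stand-in for an image handle: a small R object whose
# cleanup is tied to a finalizer rather than to rm().
make_handle <- function() {
  handle <- new.env()
  reg.finalizer(handle, mark_finalized)
  handle
}

h <- make_handle()
rm(h)              # the handle is now unreachable, but nothing has been freed yet
invisible(gc())    # a collection is what actually runs the finalizer
print(finalized)   # TRUE once the collection has run the finalizer
```

Because the R-side object is tiny, R is under little pressure to collect on its own, so under this assumption any large allocation behind the handle would stay alive until something triggers a collection.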

muratmaga commented 4 years ago

Thanks for all the explanations, it all makes sense. A bash script and parallelization are good suggestions too.