Closed: caspervdw closed this issue 9 years ago
It might be worth seeing what happens with a 2GB heap? Or is there not much memory available?
Thanks for the response. I actually want to have some memory left for my analysis, so that would not be a good solution for me. But I will run the test overnight to check whether it changes anything.
It does sound like something is holding onto objects as the file is being read. I would try letting go of the reader periodically, for instance by getting a new reader once per timepoint; I don't think that will add much overhead. You should also clear the reader cache to make sure the old reader is really gone:
from bioformats.formatreader import clear_image_reader_cache
clear_image_reader_cache()
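Putting those two suggestions together, a rough sketch of the per-timepoint workaround (path and n_timepoints are placeholders, and the loop body is just illustrative):
import bioformats
from bioformats.formatreader import clear_image_reader_cache

# Sketch: re-create the reader for every timepoint so the previous one,
# and anything it holds onto, can be garbage-collected.
for t in range(n_timepoints):
    reader = bioformats.ImageReader(path)
    image = reader.read(t=t)    # run the analysis on `image` here
    reader.close()              # release the underlying Java reader
    clear_image_reader_cache()  # drop the cached reader as well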
It may be a problem in Bio-Formats itself. I've seen some traffic about .nd2 files come across recently; perhaps this is it? https://github.com/openmicroscopy/bioformats/pull/1351
We recently upgraded the Bio-Formats version, so maybe it includes a fix, but the safest bet may be the work-around. If you get it to work, please close the issue and comment on how you resolved it.
Disabling original metadata population may also help, given the number of images. Perhaps see if calling rdr.setOriginalMetadataPopulated(false) before rdr.setId(...) makes any difference? A sketch follows.
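A minimal sketch of making that call from Python, assuming the javabridge VM is already running and path is a placeholder for the file name:
import javabridge

# Sketch: construct the Bio-Formats reader directly so original-metadata
# population can be switched off before setId() is called.
rdr = javabridge.JClassWrapper('loci.formats.ImageReader')()
rdr.setOriginalMetadataPopulated(False)
rdr.setId(path)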
I increased the heap size and that solved my problem. But I would like to keep the reader as slim as possible, so it is worth looking into this memory usage build-up. I will try out your suggestions and report back if I find something. For testing, I would like to know: is there some way to inspect the current heap size through javabridge?
One cool trick is to use remote debugging to let Eclipse access the JVM inside CellProfiler (see http://blog.trifork.com/2014/07/14/how-to-remotely-debug-application-running-on-tomcat-from-within-intellij-idea/ for instance), although I set this up by passing the argument "-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:8001,server=y,suspend=n" to javabridge.start_vm(). Then you can use the Eclipse Memory Analyzer (https://eclipse.org/mat/). The combination of being able to debug in Eclipse and look at the memory is awesome.
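For reference, a minimal sketch of passing that agent argument when starting the VM; the class path shown here is just the bioformats JARs, so adjust as needed:
import javabridge
import bioformats

# Sketch: start the JVM with a JDWP agent so a remote debugger can attach.
javabridge.start_vm(
    args=['-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:8001,server=y,suspend=n'],
    class_path=bioformats.JARS)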
I made some progress: I solved the issue by removing the calls to reader.rdr.getMetadataStore().getPlanePositionX.
The issue is now narrowed down to the MetadataStore. I can reproduce it by doing the following:
jmd = javabridge.JWrapper(reader.rdr.getMetadataStore())
for i in range(1000000):
    jmd.getPlanePositionX(0, 0)
It starts slowing down from 1.5 ms to 5 ms per call from iteration 170,000 onwards and reproduces the GC overhead limit exceeded error at ~200,000 iterations. Any suggestions?
Do you need to access the metadata? You can call bioformats.formatreader.get_omexml_metadata(path) to get the raw XML version of the metadata. I have a feeling that may be the only work-around at present.
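A minimal sketch of that work-around for the plane positions, assuming the attribute names from python-bioformats' omexml module (treat the details as a sketch):
import bioformats

# Sketch: fetch the metadata once as XML and parse it on the Python side,
# avoiding repeated calls through the Java metadata store.
xml = bioformats.get_omexml_metadata(path)
md = bioformats.OMEXML(xml)
pixels = md.image(0).Pixels
x = pixels.Plane(0).PositionX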
For me, this issue was solved by switching to JPype as the Java interface. Closing.
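For anyone trying the same route, a rough sketch of driving Bio-Formats through JPype instead; this uses the modern JPype API (newer than what existed at the time), and the jar path and file name are placeholders:
import jpype

# Sketch: start the JVM via JPype with the Bio-Formats jar on the class path.
jpype.startJVM(classpath=['bioformats_package.jar'])
ImageReader = jpype.JClass('loci.formats.ImageReader')
rdr = ImageReader()
rdr.setId('myfile.nd2')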
Hi,
I have a 50 GB Nikon .nd2 file with 1000 frames of dimensions 101x512x512, and I do my image analysis by iterating over the frames using code that boils down to the sketch below:
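(The original snippet did not survive in the thread; presumably it was something along these lines, with placeholder names:)
import bioformats

reader = bioformats.ImageReader('myfile.nd2')
# Hypothetical reconstruction: loop over the 1000 timepoints and 101 z-planes.
for t in range(1000):
    for z in range(101):
        frame = reader.read(z=z, t=t)
        # ... analysis on `frame` ...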
I start my VM with 512MB heap space like this:
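(That snippet was also lost; presumably something like the following, using javabridge's max_heap_size argument:)
import javabridge
import bioformats

javabridge.start_vm(class_path=bioformats.JARS, max_heap_size='512m')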
About halfway through the t-loop, I get JavaException: GC overhead limit exceeded. Are there extra things I could do to free memory?
Python stack:
Java stack: