Closed: Xqua closed this issue 10 years ago
Hi Xqua. How big is N? How big is the pipeline?
Pipeline editing in CellProfiler has unlimited undo. The undo even works across pipeline loads, which is perhaps a little weird, but in any case it's there. The undo includes operations that add files to the file list and, for efficiency reasons, we attach a Java object containing the metadata to each file on the file list. For high-throughput screening, I guess that list could end up being very large and could eat up a lot of Java memory. The Java metadata is basically a cache of the computed metadata, so it can be discarded when the file item gets pushed onto the undo stack. I think it's not a show-stopper, but I will fix this after the release.
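To illustrate the idea of discarding the cache when an item goes onto the undo stack, here is a minimal Python sketch. It is not CellProfiler's actual code: `FileEntry`, `UndoStack`, `metadata_cache` and `load_metadata` are hypothetical names, and a plain dict stands in for the attached Java metadata object.

```python
class FileEntry:
    """Hypothetical stand-in for one item on the file list."""

    def __init__(self, url):
        self.url = url
        # Stands in for the attached Java metadata object (assumption).
        self.metadata_cache = None

    def load_metadata(self):
        # Recompute the metadata only if the cache was dropped or never filled.
        if self.metadata_cache is None:
            self.metadata_cache = {"url": self.url}
        return self.metadata_cache


class UndoStack:
    """Hypothetical undo stack that keeps entries but not their caches."""

    def __init__(self):
        self._actions = []

    def push(self, description, entries):
        # Drop the cached metadata before storing the entries, so the
        # undo history only holds the lightweight file references.
        for entry in entries:
            entry.metadata_cache = None
        self._actions.append((description, list(entries)))

    def pop(self):
        return self._actions.pop()


def demo():
    entries = [FileEntry("file:///plate1/A01_s%d.tif" % i) for i in range(3)]
    for entry in entries:
        entry.load_metadata()  # fill the caches, as adding files would
    stack = UndoStack()
    stack.push("add files", entries)
    description, restored = stack.pop()
    caches = [entry.metadata_cache for entry in restored]
    # The metadata can still be recomputed on demand after an undo.
    url = restored[0].load_metadata()["url"]
    return description, caches, url
```

The trade-off in this sketch is that metadata has to be recomputed after an undo, in exchange for the undo history no longer pinning one cache object per file.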
It's big: I am analyzing microscopy scans of 96-well plates, with pipelines of around 60K files. I had to set the JVM heap space to be able to load the files, but that's normal behavior. I know it's not a huge deal, but maybe clearing the cache when loading a new pipeline would help? Thanks!
It is a big screen: not the largest we've seen, but comparable to one of our large ones. Our image assay developers often use a trick for large screens. They use the user interface to collect the files and organize the image sets, then export the image sets to a CSV (File / export / image set listing), and then load that listing using LoadData. You'll find that CellProfiler is much more responsive and, if I remember correctly, CP should not consume Java memory when using LoadData.
You'll get some warnings about LoadData being legacy, but we fully intend to keep supporting it and, in situations like yours, we encourage people to use LoadData.
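If you would rather generate such a listing with a script than export it from the UI, a rough Python sketch follows. The column headers (`Image_FileName_<channel>`, `Image_PathName_<channel>`, `Metadata_Well`) follow the convention I'd expect LoadData to read, but treat them as an assumption and compare against a listing exported from CellProfiler itself. The function name, channel name and file names are made up for illustration.

```python
import csv
import io


def write_image_set_csv(image_sets, channel, out):
    """Write one row per image set to the open text file `out`.

    image_sets: iterable of (folder, filename, well) tuples.
    channel: image name used in the column headers (hypothetical, e.g. "DNA").
    """
    fieldnames = [
        "Image_FileName_" + channel,
        "Image_PathName_" + channel,
        "Metadata_Well",
    ]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    count = 0
    for folder, filename, well in image_sets:
        writer.writerow({
            "Image_FileName_" + channel: filename,
            "Image_PathName_" + channel: folder,
            "Metadata_Well": well,
        })
        count += 1
    return count


# Example with made-up paths; a real script would write to a file opened
# with open(path, "w", newline="") instead of an in-memory buffer.
buffer = io.StringIO()
rows_written = write_image_set_csv(
    [("/screens/plate1", "A01_s1.tif", "A01"),
     ("/screens/plate1", "A02_s1.tif", "A02")],
    "DNA",
    buffer,
)
listing = buffer.getvalue()
```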
When loading pipelines, the memory isn't cleared between loads, which ends up causing an Out of Memory error (Java Heap Space) after n loads. One then has to close and reopen the software.
This is not critical, but I thought it should be mentioned.