Llamero / Local_Hough_Circle

An improved implementation
GNU General Public License v3.0

Trouble with large stacks #2

Closed: Joel-O closed this issue 6 years ago

Joel-O commented 7 years ago

I am unable to run the transform against a large stack (3600 or 1800 images). The transform looks like it starts: memory & CPU usage increase, but then CPU drops to 0% and the ImageJ status bar reverts to normal. I've had success testing a subset of the data (900 images). Thanks,

Update: I can run 900 images if I do not check "circle outlines overlaid..." (export to table checked). With "circle outlines overlaid" checked, the transform starts but then ends without any indication. Low memory? 32 GB installed; 3.8 GB used before starting ImageJ, 4.2 GB used with ImageJ running, 7.8 GB used with the stack loaded, 28.3 GB used while the transform is running. Edit -> Options -> Memory & Threads -> Max Memory = 24477 MB.

I cannot run 1800 images even if "circle outlines overlaid" is unchecked. Once 23.8 GB is used, closing the transform dialog & image stack does not free memory; I need to close ImageJ completely to free it.

Update #2: I moved to a different workstation and was able to run against the 3600-image stack successfully, using 91 of 128 GB with "circle outlines overlaid" unchecked. With "circle outlines overlaid" checked, it failed after using 105 of 128 GB.

Llamero commented 7 years ago

This definitely sounds like a memory issue, as we have run the transform on movies with over 10,000 frames without issue. To test the memory requirements, I ran the local transform on a 10,487-frame movie, searching for one circle in each frame and generating a 16-bit (2 GB) centroid map and a results table as outputs. Running the code used 7.3 GB of RAM and took 50 seconds to complete. Each frame was 588x180 pixels of 8-bit data. That said, the amount of RAM needed will depend heavily on the size of the transform (the number of radii being tested).
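As a rough sanity check on the figures quoted above, the raw data sizes work out as follows (a minimal sketch; the per-pixel byte counts are simply the standard 8-bit and 16-bit sizes):

```python
# Sizes for the test movie described above (assumed values from the post).
frames = 10487
width, height = 588, 180
bytes_per_pixel_8bit = 1   # input movie is 8-bit
bytes_per_pixel_16bit = 2  # centroid map output is 16-bit

movie_bytes = frames * width * height * bytes_per_pixel_8bit
centroid_map_bytes = frames * width * height * bytes_per_pixel_16bit

print(f"input movie:  {movie_bytes / 1e9:.2f} GB")   # ~1.11 GB
print(f"centroid map: {centroid_map_bytes / 1e9:.2f} GB")  # ~2.22 GB, matching the "2 GB" above
```

The remainder of the 7.3 GB is working memory for the transform itself, which scales with the number of radii tested.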

Could you send me a sample piece of data and record the Hough transform command (Plugins -> Macros -> Record) when you run it? This will allow me to test the same data under the same conditions, and I may also be able to give a few tips on how to optimize things.

The worst-case scenario would be to write a macro that splits the movie into blocks, runs the transform on each block, and then concatenates the data back together at the end.
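A minimal sketch of the block-splitting idea, showing only the frame-range bookkeeping (Python is used purely for illustration; the ImageJ steps to extract each substack, run the transform, and concatenate the results are omitted and would need to be filled in with the recorded macro commands):

```python
def frame_blocks(total_frames, block_size):
    """Split frames 1..total_frames into inclusive (start, end) ranges."""
    blocks = []
    start = 1
    while start <= total_frames:
        end = min(start + block_size - 1, total_frames)
        blocks.append((start, end))
        start = end + 1
    return blocks

# e.g. a 3600-frame stack processed in blocks of 900 frames,
# the largest size reported to work in the original post
print(frame_blocks(3600, 900))
# -> [(1, 900), (901, 1800), (1801, 2700), (2701, 3600)]
```

Each (start, end) pair would then drive one substack extraction and one transform run, keeping peak memory near the single-block level.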

Llamero commented 7 years ago

And one other note: ImageJ runs in Java, which runs as a virtual machine on your computer. Basically, Java runs as its own operating system, ported through the main operating system you are using. As a result, ImageJ will request RAM as needed, and then hold onto that RAM indefinitely. This doesn't mean the RAM is necessarily in use, but rather that Java has set it aside for itself, to use when needed. To clear out any data in ImageJ's reserved RAM, just double-click on the ImageJ status bar. It will then tell you how much RAM it is using and how much RAM is available, such as "70MB of 60000MB (<1%)".

When you close ImageJ, you are also closing the instance of the Java virtual machine, which means that the virtual machine returns the RAM it was reserving for itself back to the main operating system.

You can get a better idea of how this works by opening ImageJ and then loading a large image or movie. As the movie is imported into ImageJ, you will see the RAM usage in your operating system go up, as the virtual machine continues to request more RAM to be able to load the image. Double click on the status bar, and you should see that ImageJ is now using the same amount of RAM as the size of the movie plus approx. 35MB to run ImageJ itself.

Now, close the image, and you will see that the RAM in use by ImageJ will have dropped back to about 35MB, but the RAM usage in the operating system may not have dropped. Now, load the same image again. This time, you will see that as the image loads, ImageJ does not take any more RAM from the operating system. This is because the RAM it already has reserved is just enough to hold the image, and therefore it does not need to request any more.

Hope this helps.

Joel-O commented 7 years ago

Thanks for the prompt response and the Java/memory usage explanation; I was unaware of that, good to know. I just got assigned a priority project, so I won't have time to help until I get things under control. I will upload two images from the 3600. The "ring" is slightly elliptical, and I am trying to fit a circle to each small end and then calculate the midpoint of the circle centers on my own from the transform results. The ring moves a few pixels in X & Y at a time (I uploaded images 1 & 5, so there are 3 images in between; in the transform I can cut the Local R parameters to 3).

My process is:

1. File -> Import -> Image Sequence, to open all 3600 images as a stack
2. Hough Circle Transform (screenshot below), with "Circle outlines overlaid..." as described in my initial post

[Screenshot: Hough Circle Transform dialog]

Update: I was getting "unsupported file type" on the tif files, so I zipped them: Tif.zip

Llamero commented 7 years ago

2000x2000 is a pretty large space to transform. The total memory needed for each transform is 42 × 2000 × 2000 × 4 bytes = 672 MB (42 radii, a 2000x2000 transform plane per radius, and 4 bytes per value). The transform overwrites the transform space each frame to save memory, but because Java runs garbage collection intermittently, memory can pool up at times. To dramatically speed things up, I would bin the data using Image -> Transform -> Bin and choose a min bin (a bit counterintuitive, but a vestige of old ImageJ), since the circles are only a single pixel wide. Even a 10x10 bin preserves the circle, and it reduces the memory needed 100x. One other thing that won't change the memory but will boost the speed is to check the "reduce" box. This will remove all redundant transform steps.
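Putting numbers on the binning suggestion (a sketch using the figures from this thread; the 4-bytes-per-value accumulator size follows from the 672 MB figure and is otherwise an assumption):

```python
def transform_space_bytes(width, height, n_radii, bytes_per_value=4):
    """Memory for one frame's Hough transform space: one accumulator plane per radius."""
    return n_radii * width * height * bytes_per_value

full = transform_space_bytes(2000, 2000, 42)  # original frame size
binned = transform_space_bytes(200, 200, 42)  # after a 10x10 min bin, same radius steps

print(f"full:   {full / 1e6:.0f} MB")      # 672 MB
print(f"binned: {binned / 1e6:.2f} MB")    # 6.72 MB
print(f"reduction: {full // binned}x")     # 100x
```

The 100x saving comes purely from the smaller spatial area; the radii searched would also shrink in the binned image, which can reduce the cost further.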

One final thing to try would be the standard Hough transform instead of the local Hough transform. The standard transform multithreads by dividing the transform space into blocks of radii to be tested, with each core handling a different subset of radii. The local transform multithreads by having each core look for a different circle. Since you only have one circle, the standard transform may be faster. The reason is that the standard transform only transforms mask pixels (255) and ignores background pixels (0); since the circle is the only thing in the frame, the standard transform may wind up doing fewer transform operations than the local one. The difference in speed will be system dependent, so the only real way to know for sure is to run both.