Open · cudmore opened this issue 4 years ago
Hey! I am doing some initial checking into what may be going on, but it is hard to tell right now. Can I ask how much memory your machine has?
This is the batch processing function provided in aics-segmentation. It is nothing more than looping through the files one by one.
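For reference, a minimal sketch of that kind of file-by-file batch loop (the directory names, the .tif reading via tifffile, and the f3_param values are placeholders, not the actual batch wrapper):

```python
# Minimal sketch of a file-by-file batch loop, assuming .tif inputs
# readable with tifffile; the real batch function in aics-segmentation
# may handle readers, channels, and output naming differently.
from pathlib import Path

import numpy as np
import tifffile
from aicssegmentation.core.vessel import filament_3d_wrapper

f3_param = [[1, 0.01]]  # [scale, cutoff] pairs; values here are placeholders

for path in sorted(Path("input_dir").glob("*.tif")):
    stack = tifffile.imread(path).astype(np.float32)
    bw = filament_3d_wrapper(stack, f3_param)  # one stack at a time
    tifffile.imwrite(Path("output_dir") / path.name, bw.astype(np.uint8))
```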
Adding dask support seems to be an important feature for the next release. I will look into it.
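One possible shape such dask support could take, purely as an illustration (file names and parameters are made up; this is not an existing aics-segmentation API):

```python
# Illustrative sketch only: fan out independent per-file segmentations
# with dask.delayed.
import dask
import tifffile
from aicssegmentation.core.vessel import filament_3d_wrapper

@dask.delayed
def segment_one(path, f3_param):
    stack = tifffile.imread(path)
    return filament_3d_wrapper(stack, f3_param)

paths = ["a.tif", "b.tif"]  # placeholder file names
tasks = [segment_one(p, [[1, 0.01]]) for p in paths]
results = dask.compute(*tasks, scheduler="processes")  # processes sidestep the GIL
```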
I am trying to run a few aics-segmentation functions on a dask array so I can process a number of stacks in parallel.
For example, aicssegmentation.core.vessel.filament_3d_wrapper ...
1) If I run it on a dask array of length 1, it completes one stack in ~20 seconds with minimal CPU usage. This is about the same as running without the wrapping dask array, so that looks good.

2) If I run it on a dask array of length 4, it completes each stack in ~600 seconds, with CPU usage looking like the 1x case. The four stacks run in parallel but do not increase CPU usage and are ~30 times slower than a single stack?

[update] Ran it again with np.float, and each call to filament_3d_wrapper across the 4 stacks took ~1240 seconds, yikes!

I started looking at the source and after some tracing came up with no obvious reason; all I see is normal Python/NumPy/SciPy code. I seem to remember that aics-segmentation has a set of batch functions? Should I use those instead? Any links to example code?
Here is some sample code. In particular, scipy.ndimage.median_filter seems to work fine (it runs in parallel and maxes out the CPU), but filament_3d_wrapper runs >30x slower and does not max out the CPU (usage looks like the 1x case).
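The sample code itself did not survive in this copy of the thread, so what follows is only a reconstruction of the comparison described above; the array shapes, chunking, and f3_param values are guesses:

```python
# Hedged reconstruction of the comparison described above; shapes,
# chunking, and f3_param values are guesses, not the original sample code.
import numpy as np
import dask.array as da
import scipy.ndimage
from aicssegmentation.core.vessel import filament_3d_wrapper

def run_filament(block):
    # block has shape (1, z, y, x): segment the single 3D stack inside it
    mask = filament_3d_wrapper(block[0], [[1, 0.01]])
    return mask[np.newaxis, ...].astype(np.uint8)

stacks = np.random.rand(4, 64, 256, 256).astype(np.float32)  # 4 fake 3D stacks
darr = da.from_array(stacks, chunks=(1, 64, 256, 256))       # one stack per chunk

# scipy.ndimage.median_filter parallelizes across chunks and maxes out the CPU ...
med = darr.map_blocks(scipy.ndimage.median_filter, size=3, dtype=np.float32).compute()

# ... while the same pattern with filament_3d_wrapper is reported above to be
# ~30x slower per stack without increasing CPU usage.
seg = darr.map_blocks(run_filament, dtype=np.uint8).compute()
```

If the default threaded scheduler turns out to be the bottleneck, swapping scheduler="processes" into the compute() calls is one knob to try.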