Hi, I'm developing a pipeline to generate meshes suitable for finite element simulations from segmented electron microscopy images of brain cells. I found your software (cloudvolume, fastremap, ...) very useful for this, thank you! However, one missing piece for me is a fast method for morphological operations on multilabel 3D images. To keep the number of points in my meshes small, I need to smooth and simplify the cellular geometries, and I use a sequence of morphological erosion, dilation, opening, closing, etc. for this. So far it's based on scipy.ndimage / dask-image, but these only support binary images, requiring looping over all labels in the dataset.
Is there a way to add support for fast multilabel dilation to this package? Or do you have a pointer to some other package for me? Thanks!
Marius
A small illustration of the resulting FEM mesh, intended for e.g. electrophysiology, fluid flow, or cell swelling simulations:
Hi Marius,
Thanks for writing in, I'm glad you're finding these tools useful!
The method I'm currently using is not suitable for multi-label images. However, it does seem like it would be possible to devise one. In your opinion, would it be useful to define the multilabel operator as:
Will
Hi Will, thanks for your quick reply! Currently, I mainly work with option 1 (all "on", labels can dilate into other labels) and sometimes option 2 (labels only dilate in background voxels). It would be optimal if one could choose between all those options, but I would already be happy with either of them. Do they differ a lot in their complexity, both for implementation effort and runtime?
Marius
Hi Marius,
I think I was a bit unclear. The list wasn't a set of alternative options but a description of the whole system (all three properties at once). It would be possible to make certain properties configurable.
What does dilation look like to you in the case where two labels are touching?
Will
Hi Will, I think two options would be reasonable for touching labels: a) don't dilate them at their interface and b) dilate only one (and consequently erode the other). Do you think this could be configurable?
So here's the good news: I have an untested prototype of (a) here: https://github.com/seung-lab/fastmorph/pull/2
For (b), there would have to be some arbitrary criterion as to which shape gets eroded by the other, such as the max label id winning. Would that be acceptable?
Great! I'll have a look at the prototype and test it out soon. The max label id winning is totally acceptable; as you say, it's arbitrary anyway. Thanks a lot for your effort!
After thinking about it a bit, it might make sense to pass in a list of label ids? The order of the ids in the list could then determine which label gets eroded and which dilated. Ideally, that would also allow dilating only a subset of the labels.
Interesting, but what happens when you pick two labels that are adjacent? Then you run into the same problem.
One issue with this approach is that the most obvious implementation requires a set or map with potentially frequent access. That would be a big performance hit (though not as big as having to process each individual shape as you have been doing).
I just realized there's a way to make those concepts work together. Make the base algorithm work based on smaller (or larger) numbers dominating, but accept a list in python and remap the volume in list order.
Nevermind, there's actually a much simpler solution that has a very simple implementation. I just pick the stencil mode at each point.
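Roughly, the idea is the following (a slow pure-numpy sketch of the semantics I have in mind, not the actual compiled implementation):

```python
import numpy as np

def mode_dilate(labels, background_only=True):
    # Illustrative sketch only: each voxel takes the most common nonzero label
    # in its 3x3x3 neighborhood, with ties going to the larger label id.
    out = labels.copy()
    padded = np.pad(labels, 1)  # zero padding for the border voxels
    sx, sy, sz = labels.shape
    for x in range(sx):
        for y in range(sy):
            for z in range(sz):
                if background_only and labels[x, y, z] != 0:
                    continue  # in this mode, labels only grow into background
                stencil = padded[x:x + 3, y:y + 3, z:z + 3]
                vals = stencil[stencil != 0]
                if vals.size:
                    ids, counts = np.unique(vals, return_counts=True)
                    out[x, y, z] = ids[counts == counts.max()].max()
    return out
```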
Speed trial on a 512^3 volume with >2000 labels.
Background only: 1.0 sec
Allowing conflicts: 49.4 sec
Is that a reasonable way to go about things?
Check out the latest code in #2; there are dilation, erosion, closing, and opening operators that are reasonably efficient.
Great, this looks pretty good! I did a quick comparison with the corresponding scikit-image functions on a 500x500x500 image (small chunk of the cortical mm3 dataset, around 500 labels):
In background_only=False mode:
import skimage.morphology as skim
In [50]: %timeit skim.dilation(img, skim.cube(3))
1.67 s ± 31.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [51]: %timeit fastmorph.dilate(img, background_only=False)
3.24 s ± 41.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In background_only=True mode:
from skimage.segmentation import expand_labels
%timeit expand_labels(img)
23.1 s ± 2.24 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fastmorph.dilate(img, background_only=True)
2.7 s ± 64.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So the background_only=True mode is really fast, while background_only=False is somewhat slower. My results look quite different from yours though; do you think this is due to image characteristics?
I haven't extensively tested the code yet, so there could be bugs. However, there are a couple of things that stand out to me.
skim.dilation seems to take the maximum of the neighborhood, while I'm taking the mode. Maximum is a simpler operation to execute, so it doesn't surprise me that it's a bit faster, though there are likely opportunities for further optimization in my code.
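A toy 2D example (not from your benchmark) of what that difference does where two labels touch:

```python
import numpy as np
import skimage.morphology as skim

img = np.zeros((5, 5), dtype=np.uint8)
img[:, :3] = 1   # label 1 on the left
img[:, 3:] = 2   # label 2 on the right, touching label 1
print(skim.dilation(img, skim.square(3)))
# The column of label 1 adjacent to label 2 flips to 2, because max() always
# favors the numerically larger id; a mode-based rule has no such bias.
```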
expand_labels has an interesting implementation. It's pretty similar to my implementation of spherical_dilate, but scipy can also compute feature maps for its distance transform, and it uses that to do the multi-label approach. Probably computing both the EDT and the feature map adds some time.
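My rough mental model of that feature-map trick, as a sketch (not the actual skimage code):

```python
import numpy as np
from scipy import ndimage

def expand_labels_sketch(labels, distance=1.0):
    # The EDT of the background also yields, per voxel, the index of the
    # nearest labeled voxel, which tells you which label to expand into it.
    distances, indices = ndimage.distance_transform_edt(
        labels == 0, return_indices=True
    )
    nearest = labels[tuple(indices)]  # label of the closest labeled voxel
    out = labels.copy()
    grow = (labels == 0) & (distances <= distance)
    out[grow] = nearest[grow]
    return out
```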
Can you share some results you find surprising? Also, what is your performance target?
Hi Will,
I wasn't aware of the difference of mode vs. max between skim.dilation and fastmorph.dilate; that makes sense then!
I was surprised that, in my test, fastmorph.dilate is roughly equally fast with background_only=True and False, while you mention a 50x difference in runtime. Any idea why?
Regarding performance: it would be great if one could process images of up to 5k^3 voxels in a reasonable time. My previous dask-image-based implementation took a few hours for roughly 1k^3 voxel images (it could probably have been optimized further, but I thought it would be worthwhile to look for alternatives).
I haven't tested fastmorph on large images yet, but if it scales with O(n), a single dilation on a 5k^3 image should take around 1 h. Given that the processing consists of a sequence of quite a few erosion/dilation operations, this would still take quite some time.
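(That estimate is just naive scaling of the 500^3 timings above: a 5k^3 volume has (5000/500)^3 = 1000 times as many voxels, so 1000 × ~2.7-3.2 s ≈ 2700-3200 s, i.e. roughly 45-55 min per dilation, assuming linear scaling and no extra memory effects.)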
I've been experimenting with the erosion operator, which is a bit simpler than dilation since the mode doesn't need to be computed.
The issue in getting additional single-core speedup has to do with the memory access pattern, which cuts across the depth dimension regardless of which orientation (C or F) you use. There are some tricks, like processing the image in blocks to ensure L1/L2 cache friendliness, but I've never personally had much success with them. I could give it a try.
Since you need to process very large images, I think the answer would unfortunately be multi-threading. Fortunately, these algorithms are pretty trivially parallel so that's something that could be considered.
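To illustrate what I mean by trivially parallel, here's a hypothetical chunking helper (a sketch, not how fastmorph is structured internally): split the volume along one axis with a one-voxel halo and hand each slab to its own worker.

```python
import numpy as np

def dilate_in_chunks(labels, dilate_fn, chunks=4):
    # A one-voxel halo is enough for a single 3x3x3 stencil pass, so each slab
    # can be processed independently (one per thread/process) and the interiors
    # stitched back together without seams.
    out = np.empty_like(labels)
    edges = np.linspace(0, labels.shape[2], chunks + 1, dtype=int)
    for lo, hi in zip(edges[:-1], edges[1:]):
        lo_h = max(lo - 1, 0)
        hi_h = min(hi + 1, labels.shape[2])
        slab = dilate_fn(labels[:, :, lo_h:hi_h])
        out[:, :, lo:hi] = slab[:, :, lo - lo_h:(lo - lo_h) + (hi - lo)]
    return out
```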
I'll need to do more experiments to understand why you are seeing background_only=False getting such good performance. In that mode, the stencil needs to retrieve 10 voxels of data at every voxel. In True mode, only background voxels need to be considered, so a large amount of work can be skipped in densely labeled data.
I gave the block approach a try and didn't see a big benefit (small penalty). I suspect the way forward is multithreading. I have a prototype that is mostly working for dilate.
Ok, I uploaded threaded dilation.
Just fixed a bug. I think with multi-threading, at least for a 5000^3 volume, you should be down to under an hour per run. Still could use some work, but hopefully that puts you in an acceptable place. I'm trying to fix the build systems for py312 but hopefully I can make a release soon.
I also found some big speedups for dilate with background_only=False, though it's possible your compiler found some of them before I did.
Version 1 has been released on PyPI! Let me know how it works out for you.
On my standard test volume, I tried the following:
import time
import fastmorph
import skimage.segmentation

s = time.time()
res = fastmorph.dilate(labels, parallel=8, background_only=False)
print("dilate / background_only=False / 8 threads: ", time.time() - s)

s = time.time()
res = skimage.segmentation.expand_labels(labels)
print("skimage dilate / 1 thread: ", time.time() - s)
The results were:
dilate / background_only=False / 1 thread: 14.205087900161743
dilate / background_only=False / 2 threads: 7.286011219024658
dilate / background_only=False / 4 threads: 4.977150917053223
dilate / background_only=False / 8 threads: 4.070333003997803
skimage dilate / 1 thread: 68.5935742855072
Memory usage looked like so:
At about 5 seconds you can see fastmorph.dilate running; around 20 seconds in, you can see scikit-image. Memory usage is also much higher in skimage.
I'm gonna close this since it's been implemented for a while. I'm finding it useful for my own purposes!