scikit-image / scikit-image

Image processing in Python
https://scikit-image.org
Other
6.08k stars 2.23k forks source link

morphology.binary_closing() is ~50x slower than morphology.closing(). The ratio should be reversed #7458

Open vali19th opened 4 months ago

vali19th commented 4 months ago

Description:

I tried with both of these images: 0_and_255.png 0_and_1.png

Docs saying it should be faster: https://scikit-image.org/docs/stable/api/skimage.morphology.html#skimage.morphology.binary_closing

Way to reproduce:

import time
import numpy as np
from skimage import io, morphology

img = io.imread("tmp/x.png")[:, :, 0]
print(np.unique(img), img.shape, img.dtype)  # [0 1] (3508, 2480) uint8
kernel = np.ones((35, 35))

t = time.perf_counter()
morphology.binary_closing(img, kernel)
t2 = time.perf_counter()
morphology.closing(img, kernel)
t3 = time.perf_counter()

print(t2 - t)  # 5.420551682997029
print(t3 - t2)  # 0.08474353399651591

Version information:

3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]  # conda environment
Linux-6.5.0-41-generic-x86_64-with-glibc2.35
scikit-image version: 0.23.2
numpy version: 1.26.4
michaelbratsch commented 2 months ago

The same is true for morphology.opening and morphology.binary_opening.

It is an issue with the underlying scipy implementation see https://github.com/scipy/scipy/issues/13991

There are also similar discussions in https://github.com/scikit-image/scikit-image/issues/1190, for example: https://github.com/scikit-image/scikit-image/issues/1190#issuecomment-572384985

Maybe it makes sense to change the documentation and not confuse people that the binary operations are faster.

stefanv commented 2 months ago

Rather bizarre!

The difference is even there for smaller kernels:

# 512x512

In [24]: %timeit ski.morphology.closing(bimage)
2.27 ms ± 24.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [25]: %timeit ski.morphology.binary_closing(bimage)
4.63 ms ± 30.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Computed results are identical. The effect is also reproducible on larger images:

# 2048x2048

In [7]: %timeit ski.morphology.closing(bimage)
38.2 ms ± 783 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [8]: %timeit ski.morphology.binary_closing(bimage)
76 ms ± 2.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
stefanv commented 2 months ago

I wonder if this is a regression in SciPy; it seems unlikely we would have written the guidance if it weren't true at the time. SciPy's docs state the same!

In terms of what to do, easiest for now may be to either (a) deprecate binary_dilation/closing or to simply have those functions dispatch to dilation/closing and remove the offending documentation.

vali19th commented 2 months ago

I think the large kernel makes the timing differences more extreme

lagru commented 2 months ago

Agreed, dispatching in binary_dilation/closing to the faster functions seems like the way to go for now. As well as an update to the docstrings.

michaelbratsch commented 2 months ago

I agree that dispatching is possible but what is the point of keeping binary_dilation/closing? It has fewer modes than dilation/closing and the original purpose was to have a faster interface for a specialised case which is not true anymore. So why bother dispatching and not go for deprecation? People can just use the other interface as a one-to-one replacement.

vali19th commented 2 months ago

Maybe the functions can be marked as deprecated and changed to do the dispatching at the same time. This way, the library users will get the speedup without changing their code.

michaelbratsch commented 2 months ago

I tried the dispatching approach and then stumbled upon the same issues as being noted in https://github.com/scikit-image/scikit-image/issues/7238, dilation and binary_dilation have some inconsistencies in the way they operate, see

Padding of even-sized SEs

and

Mirroring of SE in dilation

That means that dispatching binary_dilation to dilation is not possible in a straight forward way. I would expect the same to be true for binary_erosion. Spending much time to make it work does not seem appropriate to me because it is code we anyway want to get rid of.

I could provide a pull request to only mark the binary functions deprecated. Any thoughts on that?

stefanv commented 2 months ago

Thanks @michaelbratsch, that sounds good to me.

lagru commented 2 months ago

:+1: on the deprecation approach. As long as we explain how to use alternatives properly with regards to mirroring and even-sized footprints. :) Feel welcome to request a review from me @michaelbratsch if you make a PR.