scikit-image / scikit-image

Image processing in Python
https://scikit-image.org
Other
6.11k stars 2.24k forks source link

watershed labels some foreground pixels as 0 when using mask argument #5408

Closed jgostick closed 3 years ago

jgostick commented 3 years ago

Description

The watershed function accepts a mask argument, which in principle tells it to only segment the pixels/voxels where the mask is true. However, a small number of pixels/voxels are being labelled 0 despite being in the foreground of the mask.

Way to reproduce

import numpy as np
import scipy.ndimage as spim
from skimage.morphology import disk
from skimage.segmentation import watershed
from skimage.feature import peak_local_max
import matplotlib.pyplot as plt

# Generate an image of spheres for segementing
np.random.seed(1)
s = np.random.rand(300, 300) < 0.999
dt = spim.distance_transform_edt(s)
im = dt > 12

# Compute the markers, blur dt and use larger footprint than default
dt = spim.gaussian_filter(dt, sigma=0.4)
peaks = peak_local_max(dt, footprint=disk(6))
peak_mask = np.zeros_like(im, dtype=bool)
peak_mask[tuple(peaks.T)] = True
markers = spim.label(peak_mask)[0]

# Now do watershed with and without mask
ws1 = watershed(-dt, markers)
ws1 = ws1 * im  # Apply 'mask' to regions after the fact
ws2 = watershed(-dt, markers, mask=im)

# The two images are not identical
assert np.all(ws1 == ws2)

# Although the look pretty good by eye
fig, ax = plt.subplots(1, 2)
ax[0].imshow(ws1/im)
ax[1].imshow(ws2/im)

# We can see that some parts of the foreground have not been labelled in ws2
print(np.sum((ws2[im]) == 0))  # 11 pixels are labelled 0
# While all the foreground in ws1 is labelled
print(np.sum((ws1[im]) == 0))

Version information

# Paste the output of the following python commands
3.8.10 | packaged by conda-forge | (default, May 11 2021, 06:25:23) [MSC v.1916 64 bit (AMD64)]
Windows-10-10.0.19042-SP0
scikit-image version: 0.18.1
numpy version: 1.20.2
jni commented 3 years ago

Thanks for the report @jgostick!

After exploring for a bit, I've concluded that this is not a bug but rather an unexpected property of the watershed. Specifically, if you plot the 11 pixels (plt.imshow((ws2 == 0) * im)), you will see that there is no foreground-only path from any marker to those pixels. In other words, the masked watershed forbids propagating labels outside the masked region, and so those pixels can never be labeled. (See the white pixels in the screenshot below.) This is a feature, not a bug — the masked region could be much, much smaller than the full image, so you can get a very efficient watershed when using masks.

But I understand that it is a surprising result. I wonder if you have any ideas about how this could be documented somewhere? Perhaps we should add it to the watershed gallery example?

Screen Shot 2021-05-31 at 10 43 37 am
jgostick commented 3 years ago

Firstly, I love that visualization, showing the regions and the dt so nicely. I'm going to borrow that in future!

Now that you explain it, this makes perfect sense, and I'm annoyed for not seeing it myself and wasting your time. The reason that I was 'tricked' was because my mask was literally just the foreground of the image, which I expected to be exactly the same as the unmasked result, which also only applies to the foreground...but I guess the difference is that unmasked version allows propagating labels through the background, right?

Why not issue a warning when any masked pixels are not labelled, like:

if np.any((regions*mask) == 0):
    print('Warning, some disconnected foreground pixels have not been assigned a label')

We issue plenty of such warnings in PoreSpy, but I'm realizing now that skimage rarely seems to warn me about anything...which much be some deliberate decision?

jni commented 3 years ago

Firstly, I love that visualization, showing the regions and the dt so nicely.

Thanks! I used napari for the visualization. I added the following code at the end of your script:

import napari

v = napari.view_image(im)
v.add_image(dt)
v.add_points(peaks, size=2)
v.add_labels(ws1)
v.add_labels(ws2)
labels = v.add_labels((ws2 == 0) * im)
labels.color = {1: (1, 1, 1, 1)}

napari.run()  # not needed in IPython

Then played with the visibility and opacity of various layers.

Why not issue a warning when any masked pixels are not labelled, like: We issue plenty of such warnings in PoreSpy, but I'm realizing now that skimage rarely seems to warn me about anything...which much be some deliberate decision?

It's not a bad idea. The principle we follow in skimage is that warnings should be easy to silence if you know what you're doing. The reasoning is that issuing too many warnings trains users to simply ignore them. In this case, I don't see a nice way to silence this other than warn_on_disconnected_pixels=False or a similarly ugly parameter. But it's worth considering!