Weird behaviour: size_min_max_filter

KoenDesplenter commented 9 months ago

I was playing around with different parameter configurations for the preprocessing step.

When using different size_min_max_filter values I discovered weird behaviour. When applying a filter of even numbers I get bad quality output, no matter what value being used. When using an odd number, the image quality remains decent and changes depending on the filter value.

How come that even values for the size_min_max_filter give this weird behaviour?

size_min_max_filter: 30 size_min_max_filter: 31 size_min_max_filter: 32

Code fragment:

for i in range(30, 50):
    print(i)
    sdata = nas.im.min_max_filtering(
        sdata=sdata,
        output_layer="min_max_filtered",
        size_min_max_filter=i,
        img_layer="tiling_correction",
        overwrite=True,
    )
    nas.pl.plot_image(
        sdata, img_layer="min_max_filtered", crd=crop_param, figsize=(8, 8)
    )

ArneDefauw commented 8 months ago

I could reproduce the issue. Probably dask-image/scipy tries to do something clever when filter size is even, which I can imagine can lead to issues... I have to do some more checks, but I thinks it makes sense to round to a non-even number if an even number is provided as max_filter

SilverViking commented 8 months ago

This code snippet can reproduce the problem on simple test data:

from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt
from skimage import data

# Instantiate classic test image 'camera'
img = np.array(data.camera(), dtype=np.uint16)
print(type(img))
plt.imshow(img); plt.show()

def minmax_filter(img, size):
    result = ndimage.minimum_filter(img, size=size)
    result = ndimage.maximum_filter(result, size=size)
    result = img - result  # beware - may trigger underflow!
    print(f'filter size={size}, dtype={result.dtype}, min={np.min(result)}, max={np.max(result)}')
    plt.imshow(result); plt.show()

minmax_filter(img, 40)
minmax_filter(img, 41)

minmax_filter(np.array(img, dtype=float), 40)
minmax_filter(np.array(img, dtype=float), 41)

The output is:

<class 'numpy.ndarray'> filter size=40, dtype=uint16, min=0, max=65535

filter size=41, dtype=uint16, min=0, max=239

filter size=40, dtype=float64, min=-176.0, max=236.0

filter size=41, dtype=float64, min=0.0, max=239.0

Note that if the filter size is odd (41), the results are correct, no matter if the dtype is float or uint16. In case the filter size is even (40), the result is totally bogus for dtype uint16, but looks better for dtype float.

The explanation for the strange output image is that the result of the filter calculation original_image - maximum_filter(minimum_filter(original_image)) yields pixels with negative intensity values in the case where the filter size is even. If the dtype of the original image is unsigned (which it typically is in sparrow), underflow will then occur and instead of zero values, very large values will be stored in the result array instead. If the dtype is signed (like float) the result looks better because negative values do not wrap around.

Note that for odd filter sizes, the output of minmax_filter() is mathematically guaranteed to be non-negative. This is what we observe in the output.

Specifically, in the case of an odd filter size, when calculating the output of the maximum_filter for pixel (i,j), we take that maximum over a window of values centered on (i,j). But for each of thése values the window that was used for calculating them during the minimum_filter step, also overlapped with pixel (i,j). So all values in the window over which we take the maximum, are smaller or equal to the original pixel value. So when subtracting this from the original pixel intensity, we are guaranteed to get a non-negative result.

For even filter sizes, the problem is that the filter windows cannot be centered symmetrically around the output pixel (using integer pixel coordinates), and (in ndimage) the window extends one pixel more up and to the left, than right and to the bottom. This asymmetry in the filter windows geometry implies that the maximum step will take a maximum over values whose window, during the minimum_filter calculation, did not include the pixel (i,j) and hence might be larger than (i,j), breaking the guarantee for a non-negative output.

Conclusion, we should probably just forbid even window sizes. Since our typical window sizes are relatively large anyway, a one pixel bigger or smaller window size makes little difference.

SilverViking commented 8 months ago

This is an even simpler test case:

import numpy as np
from scipy import ndimage

def minmax_filter(img, size):
    print(f'Filter of size={size}')

    print('Original image I:')
    print(img)

    result = ndimage.minimum_filter(img, size=size)
    print('minimum_filter(I):')
    print(result)

    result = ndimage.maximum_filter(result, size=size)
    print('maximum_filter(minimum_filter(I)):')
    print(result)

    result = img - result
    print('I - maximum_filter(minimum_filter(I)):')
    print(result)

a = 100
b = 40
img = np.array([[a,a,a,a,a,a],
                [a,a,a,a,a,a],
                [a,a,a,a,a,a],
                [a,a,a,a,a,a],
                [b,b,b,b,b,b],
                [b,b,b,b,b,b],
                [b,b,b,b,b,b],
                [b,b,b,b,b,b]], dtype=int)  #  using dtype signed int to see incorrect negative values

filter_size = 4
minmax_filter(img, filter_size)  # some output values < 0, bad

filter_size = 5
minmax_filter(img, filter_size)  # all output values >= 0, correct

With output:

Filter of size=4

Original image I:
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

minimum_filter(I):
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

maximum_filter(minimum_filter(I)):
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

I - maximum_filter(minimum_filter(I)):
[[  0   0   0   0   0   0]
 [  0   0   0   0   0   0]
 [  0   0   0   0   0   0]
 [  0   0   0   0   0   0]
 [-60 -60 -60 -60 -60 -60]    <= unexpected negative values
 [  0   0   0   0   0   0]
 [  0   0   0   0   0   0]
 [  0   0   0   0   0   0]]

Filter of size=5
Original image I:
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

minimum_filter(I):
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

maximum_filter(minimum_filter(I)):
[[100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [100 100 100 100 100 100]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]
 [ 40  40  40  40  40  40]]

I - maximum_filter(minimum_filter(I)):
[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]  <= no negative values, correct!
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]

ArneDefauw commented 8 months ago

Ok, I will merge a 'fix' for this in main https://github.com/saeyslab/napari-sparrow/commit/33cf6a1e195b5a8c96e2242a4a102319c8f4cd6b

ArneDefauw commented 8 months ago

Fixed by https://github.com/saeyslab/napari-sparrow/pull/160

saeyslab / napari-sparrow

Weird behaviour: size_min_max_filter #159