scikit-image / scikit-image

Image processing in Python
https://scikit-image.org

skimage.metrics.structural_similarity() vs MATLAB #5192

Open zakajd opened 3 years ago

zakajd commented 3 years ago

Description

There are two MATLAB implementations of SSIM from the original authors, which can be found here. The first one, ssim_index.m, matches the current implementation in skimage.metrics. The second one, ssim.m, differs in image preprocessing: before computing SSIM, the authors propose average pooling with factor F, where F = max(1, round(min(H, W)/256)).
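As a rough illustration of the formula above, here is a minimal sketch of how F behaves for a few image sizes. The function name downsample_factor is made up for this example. It rounds halves up, as MATLAB's round() does for positive values; Python's built-in round() rounds halves to the nearest even integer, so the two can disagree when min(H, W)/256 lands exactly on a half.

```python
import math


def downsample_factor(height, width):
    """Hypothetical helper: F = max(1, round(min(H, W) / 256)).

    floor(x + 0.5) rounds halves up, which matches MATLAB's round()
    for the positive values that occur here.
    """
    ratio = min(height, width) / 256
    return max(1, int(math.floor(ratio + 0.5)))


# Example sizes chosen only for illustration.
for shape in [(256, 256), (383, 512), (384, 512), (640, 960), (1080, 1920)]:
    print(shape, "-> F =", downsample_factor(*shape))

# Python's built-in round() uses round-half-to-even, so it differs at 2.5:
print(round(640 / 256), downsample_factor(640, 960))  # 2 versus 3
```

The factor stays at 1 until the smaller side reaches 384 pixels, which is where the two MATLAB implementations start to diverge.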

The two implementations are equivalent for small images (min(H, W) < 384, where F = 1), so why does it matter? SSIM and other full-reference quality assessment metrics (e.g. PSNR) are often used as "perceptual similarity estimators", meaning that you get higher values for more "similar" images. The quality of a metric's predictions is benchmarked on a set of human-labeled images (Mean Opinion Score databases) using the Spearman rank-order correlation coefficient (SRCC) and the Kendall rank-order correlation coefficient (KRCC). A higher correlation means that the metric's predictions of "distance" between images are more consistent with human judgements.
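For readers unfamiliar with SRCC and KRCC, here is a minimal sketch of how they can be computed with scipy.stats. The scores below are made-up numbers for illustration only, not TID2013 data.

```python
from scipy.stats import kendalltau, spearmanr

# Made-up example: human opinion scores and a metric's predictions
# for five distorted images (illustrative values only).
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
metric = [0.10, 0.30, 0.20, 0.80, 0.90]

# Both functions return (correlation, p-value); only the correlation is used.
srcc = spearmanr(mos, metric)[0]
krcc = kendalltau(mos, metric)[0]

print(f"SRCC = {srcc:.4f}, KRCC = {krcc:.4f}")  # SRCC = 0.9000, KRCC = 0.8000
```

Only the ordering of the predictions matters here: swapping one adjacent pair of images is enough to drop both coefficients below 1.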

I ran a test to compare performance on the TID2013 database, which is often used in image quality assessment papers. Here are the results:

Metric                                       SRCC     KRCC
structural_similarity()                      0.5544   0.3883
structural_similarity() with downsampling    0.7201   0.5271
peak_signal_noise_ratio()                    0.6869   0.4958

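So the current code measures "perceptual similarity" even worse than simple MSE between images (PSNR is a monotonic function of MSE, so the two rank images identically, up to sign). 😞 The exact reasons for this behaviour are outside the scope of this issue.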
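The PSNR row can stand in for MSE because PSNR = 10 * log10(data_range^2 / MSE), which is strictly decreasing in MSE. A small sketch on synthetic data (random images and noise levels chosen arbitrarily for illustration) shows that the two produce exactly reversed rankings:

```python
import numpy as np
from scipy.stats import spearmanr
from skimage.metrics import mean_squared_error, peak_signal_noise_ratio

rng = np.random.default_rng(0)
reference = rng.random((64, 64))  # synthetic "image" with values in [0, 1)

mse_values, psnr_values = [], []
for sigma in (0.01, 0.05, 0.1, 0.2):  # arbitrary noise levels
    distorted = reference + sigma * rng.standard_normal(reference.shape)
    mse_values.append(mean_squared_error(reference, distorted))
    psnr_values.append(
        peak_signal_noise_ratio(reference, distorted, data_range=1.0)
    )

# Rank correlation between MSE and PSNR: a perfectly reversed ordering.
rho = spearmanr(mse_values, psnr_values)[0]
print(rho)
```

Any rank-based score (SRCC, KRCC) computed against human ratings therefore has the same magnitude for PSNR and for MSE.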

Way to reproduce

The code I used for the computation is a bit messy. It is based on https://github.com/photosynthesis-team/piq/blob/feature/benchmark/tests/results_benchmark.py, with an additional wrapper for skimage to support tensors as inputs.

How to fix

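A simple way to match the performance is to add a few lines of code before the beginning of the SSIM computation: https://github.com/scikit-image/scikit-image/blob/8acad22ff31d44b17651003db49791640d9b0b41/skimage/metrics/_structural_similarity.py#L187-L189

# Assumes 2-D grayscale arrays im1 and im2, with numpy imported as np.
H, W = im1.shape[:2]
# int(x + 0.5) rounds halves up like MATLAB's round(); Python's round() rounds halves to even.
F = max(1, int(min(H, W) / 256 + 0.5))
if F > 1:
    im1 = skimage.measure.block_reduce(im1, (F, F), np.mean)
    im2 = skimage.measure.block_reduce(im2, (F, F), np.mean)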
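One detail worth checking with the snippet above: when the image size is not a multiple of F, block_reduce pads the image with cval (0 by default) before pooling, which darkens the last row and column. The sketch below uses a tiny all-ones array to make the effect visible; cropping to a multiple of the block size first is one possible workaround, not necessarily what the MATLAB code does.

```python
import numpy as np
from skimage.measure import block_reduce

image = np.ones((5, 5))  # 5 is not a multiple of the 2x2 block size

# Default behaviour: the image is zero-padded to 6x6 before pooling.
padded = block_reduce(image, (2, 2), np.mean)

# Possible workaround: crop to a multiple of the block size first.
cropped = block_reduce(image[:4, :4], (2, 2), np.mean)

print(padded)   # edge blocks average in the zero padding (0.5, corner 0.25)
print(cropped)  # all ones
```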

Version information

3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0]
Linux-4.15.0-132-generic-x86_64-with-Ubuntu-18.04-bionic
scikit-image version: 0.16.2
numpy version: 1.19.4

I can open a PR if you're interested.

grlee77 commented 3 years ago

Thanks for raising the issue @zakajd. This is interesting and seems worth having.

To maintain backwards compatibility, I would suggest implementing it via a new keyword-only argument, automatic_downsampling (or auto_downsample), that defaults to False.
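A minimal sketch of what such an opt-in flag could look like, written as a standalone wrapper rather than a change to skimage itself. The function name ssim_auto_downsample is hypothetical, the sketch only handles 2-D grayscale images, and cropping before pooling is an assumption made here to avoid zero padding.

```python
import numpy as np
from skimage.measure import block_reduce
from skimage.metrics import structural_similarity


def ssim_auto_downsample(im1, im2, *, automatic_downsampling=False, **kwargs):
    """Hypothetical wrapper: 2-D grayscale SSIM with optional average pooling."""
    if im1.shape != im2.shape or im1.ndim != 2:
        raise ValueError("expected two 2-D images of identical shape")
    if automatic_downsampling:
        height, width = im1.shape
        # floor(x + 0.5) rounds halves up, like MATLAB's round() for x > 0.
        factor = max(1, int(np.floor(min(height, width) / 256 + 0.5)))
        if factor > 1:
            # Crop to a multiple of the factor so block_reduce does not pad.
            h = height - height % factor
            w = width - width % factor
            im1 = block_reduce(im1[:h, :w], (factor, factor), np.mean)
            im2 = block_reduce(im2[:h, :w], (factor, factor), np.mean)
    # Remaining keyword arguments are forwarded unchanged.
    return structural_similarity(im1, im2, **kwargs)


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
reference = rng.random((512, 768))
noise = 0.1 * rng.standard_normal(reference.shape)
distorted = np.clip(reference + noise, 0.0, 1.0)

plain = ssim_auto_downsample(reference, distorted, data_range=1.0)
pooled = ssim_auto_downsample(
    reference, distorted, data_range=1.0, automatic_downsampling=True
)
print(plain, pooled)
```

With the flag left at its default of False, the result is identical to calling structural_similarity() directly, so existing callers would see no change.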