zakajd opened this issue 3 years ago. Status: Open.
According to TensorFlow's official SSIM implementation (https://www.tensorflow.org/api_docs/python/tf/image/ssim), no average pooling is applied when the image size is > 256. So I wonder whether the SSIM proposed here (https://github.com/photosynthesis-team/piq) should be adopted instead, especially when SSIM is used as a metric.
@lijx10 With the flag `downsample=False`, the implementation in PIQ matches TF to within 1e-4. However, image quality assessment metrics are evaluated not by their similarity to the TensorFlow code, but by their prediction quality and correlation with human judgments.
So the issue with TF is that it measures the “perceptual distance” between two images poorly, and thus it's better not to use it as guidance for model optimisation or model selection.
Do you mean that pytorch-ssim is identical to the official TF SSIM, and that both of them lack the downsampling step?
Yes, that’s true. Please see the linked skimage issue, where I described this in greater detail.
Hi! This code contains the same error as skimage and pytorch-ssim. You can read the full description here: https://github.com/scikit-image/scikit-image/issues/5192
In short, when SSIM is used to estimate perceptual quality, the authors of the original paper proposed downsampling the images first, so that SSIM focuses on the major differences between the reference and distorted inputs.
So what? If you are using this implementation as a loss function for a CNN, you're likely leading it in the wrong direction.
Alternatives: You can find a correct implementation of SSIM, MS-SSIM and some other metrics here: https://github.com/photosynthesis-team/piq
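For readers unfamiliar with the downsampling step discussed in this thread, here is a minimal pure-Python sketch of the idea. It assumes the commonly cited rule of choosing an integer factor so that the shorter image side ends up near 256 pixels, then average-pooling by that factor before SSIM is computed. The function names `downsample_factor` and `average_pool` are hypothetical and are not taken from PIQ, TF, or skimage. Boundary handling and rounding of exact `.5` cases may differ between implementations.

```python
def downsample_factor(height, width, target=256):
    """Integer pooling factor that brings the shorter side close to `target` pixels.

    Assumption: the rule is f = max(1, round(min(H, W) / 256)).
    Note that Python's round() rounds exact halves to the nearest even integer.
    """
    return max(1, round(min(height, width) / target))


def average_pool(image, factor):
    """Average-pool a 2D grayscale image (a list of rows of floats).

    Uses non-overlapping factor x factor blocks; trailing rows and columns
    that do not fill a complete block are dropped.
    """
    if factor == 1:
        return [list(row) for row in image]
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum all pixels in the current block, then divide by the block area.
            block_sum = sum(
                image[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            )
            row.append(block_sum / (factor * factor))
        pooled.append(row)
    return pooled


# Tiny 4x4 example image with values 0..15, pooled by a factor of 2.
example = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
print(downsample_factor(512, 768))  # 2
print(average_pool(example, 2))     # [[2.5, 4.5], [10.5, 12.5]]
```

Both the reference and the distorted image would be pooled with the same factor before the SSIM statistics are computed. With tensors, the same pooling is usually done with a library average-pooling operation rather than Python loops.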