pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Feature request: SSIM/MS-SSIM #6934

Open alwynmathew opened 6 years ago

alwynmathew commented 6 years ago

The Structural Similarity Index (SSIM) is generally considered to be a milestone in the recent history of Image Quality Assessment (IQA).

It would be nice to see a built-in SSIM/MS-SSIM function in PyTorch.

cc @fmassa @vfdev-5

macaodha commented 6 years ago

I haven't explored it, but it looks like there is a TensorFlow SSIM implementation as part of their code base.

karandwivedi42 commented 6 years ago

@zou3519 I can take this up if there is interest in adding this to PyTorch.

jcowles commented 6 years ago

I'm not claiming it's correct, but there is also a PyTorch implementation here: https://github.com/Po-Hsun-Su/pytorch-ssim

Auth0rM0rgan commented 5 years ago

Hey @zou3519 ,

I think you should change the label of this feature request from 'todo' to 'medium priority task' since the SSIM loss is very helpful in GAN models.

cardoso-neto commented 5 years ago

@Auth0rM0rgan Most image-to-image translation tasks would benefit from this. Please take this up, @karandwivedi42.

veritas9872 commented 5 years ago

@karandwivedi42 I also think that it makes for a great loss function.

Many tasks, such as image reconstruction and image super-resolution, use SSIM and MS-SSIM as loss functions, and this helps enormously with the output quality.

musikisomorphie commented 5 years ago

@soumith, I noticed that you were assigned to several issues. Can I help you with this one? I work on image restoration and GANs, so I use this metric a lot. I could give it a try.

veritas9872 commented 5 years ago

@musikisomorphie I think it would be great if there were a properly tested SSIM and MS-SSIM in PyTorch.

veritas9872 commented 5 years ago

I have found that it is rather inconvenient to use the SSIM in skimage for PyTorch tensors, since they are in NCHW order instead of the HWC order that skimage expects (same for MS-SSIM in skvideo).

Also, their use as loss functions can significantly enhance image quality in tasks such as image reconstruction, denoising, and super-resolution.

See here for the original paper on the merits of using SSIM and MS-SSIM as loss functions.

Also see here for their use in image generation.

musikisomorphie commented 5 years ago

@veritas9872, @soumith has not answered me; I guess he is busy. Anyway, I will check the articles you mentioned and implement SSIM/MS-SSIM in the coming weeks. I will let you know if I make some progress.

mrTsjolder commented 5 years ago

There's also this paper, which seems to have been written around the same time but did not make it to publication.

soumith commented 5 years ago

It was assigned to me so that I can review SSIM and figure out whether we should add it to PyTorch core or not. I haven't gotten to it yet, but will do so soon.

musikisomorphie commented 5 years ago

@soumith, I started implementing (MS-)SSIM in torch/nn/functional.py last weekend. Because it is not trivial to write it in C++/CUDA, I implemented it using only torch operations. Should I keep going or let you take over?

veritas9872 commented 5 years ago

@soumith @musikisomorphie I would like to ask that the window size, sigma, K1, and K2 parameters be exposed for users to control. Those parameters (especially window size) are important for what exactly SSIM is measuring. The paper I uploaded illustrates the effects of SSIM with different window sizes very nicely.
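
For reference, the standard SSIM definition shows where these parameters enter. The local means, variances, and covariance are computed under the sliding window (so window size and sigma shape every term), and K1 and K2 set the stabilizing constants relative to the dynamic range L of the pixel values. The defaults usually quoted from the original SSIM paper are an 11x11 Gaussian window with sigma = 1.5, K1 = 0.01, and K2 = 0.03; whether the PR uses the same defaults is not stated in this thread.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad C_1 = (K_1 L)^2, \quad C_2 = (K_2 L)^2
```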

veritas9872 commented 5 years ago

@musikisomorphie There is already an implementation using PyTorch code here. I think it implements both SSIM and MS-SSIM. However, there are no tests to verify it is correct. Comparing results with skimage might be helpful. Also, pylint gave me a lot of errors when I downloaded it.

musikisomorphie commented 5 years ago

@veritas9872, thank you for mentioning that repo. Actually, my implementation is inspired by it, with some changes so that the code can reproduce the results of the original MATLAB code. I ran some tests myself for both SSIM and MS-SSIM; the results produced by my code show a 1-2% difference compared to the MATLAB code. I will open a PR soon.

veritas9872 commented 5 years ago

@musikisomorphie Thank you for your work on the project. I am greatly looking forward to using SSIM in PyTorch for my projects, which mostly involve image reconstruction for now.

Although I did mention this in my previous comment, I would like to ask for all the parameters in SSIM and MS-SSIM to be made tunable by the programmers, with most of the usual values set as defaults.

In the paper that I mentioned previously, different versions of SSIM (e.g. with different window sizes) have very different effects on outputs. I would be most grateful if they were made tunable from the beginning, as their use as a loss function is very helpful for image reconstruction tasks.

Shreeyak commented 5 years ago

@musikisomorphie Thanks for working on it! @soumith SSIM is also very useful for other image tasks such as depth prediction. Having a verified implementation inside PyTorch would be very useful indeed. TensorFlow has an SSIM loss integrated, which people have used for depth estimation, and it makes a big difference.

WenmuZhou commented 5 years ago

mark

Auth0rM0rgan commented 5 years ago

@musikisomorphie, thanks for working on SSIM. I am just curious: when are you going to open a PR and publish your code?

musikisomorphie commented 5 years ago

@Auth0rM0rgan The PR is open: https://github.com/pytorch/pytorch/pull/21256. Unfortunately, I may not be able to keep maintaining it due to my busy schedule. My code can be a good starting point, so you can give it a try.

veritas9872 commented 5 years ago

@musikisomorphie Thank you for your work. I have been able to create my own working version of SSIM and MS-SSIM thanks to you. However, I am not at all sure whether my version works on other systems.

Auth0rM0rgan commented 5 years ago

Hey @veritas9872, could you share your version?!

Thanks

veritas9872 commented 5 years ago

ssim.txt

Please note that I made no effort whatsoever at compatibility with any system other than my own. You will need to remove f-strings and test whether this works on multi-GPU systems. Also, I am still trying to remove the small data transfers that occur each time this is used. In GPU programming, it is very important to avoid data transfers between CPU and GPU, which this code still does not do very well.

Auth0rM0rgan commented 5 years ago

Hey @MKFMIKU,

I think you've forgotten to set the channel before calling _fspecial_gaussian() inside the ssim_loss and ms_ssim_loss functions, since _fspecial_gaussian() needs the number of channels.

Best,

MKFMIKU commented 5 years ago

@Auth0rM0rgan My bad. Fixed it in https://github.com/pytorch/pytorch/pull/22289/commits/037bff164dd7936eccff5fd218fc404dd006be10

Auth0rM0rgan commented 5 years ago

Hey @MKFMIKU,

Thanks for the work. I would like to ask for one more feature: normalization of the input values. The current implementation seems to require values in [0, 1] or [0, 255], and if the values are negative the error below is raised. It would be great if the values were normalized before being fed to the ms_ssim loss function.

/pytorch/aten/src/THC/THCIntegerDivider.cuh:82: IntDivider::IntDivider(unsigned int): Assertion `divisor >= 1 && divisor <= (2147483647)' failed. Aborted (core dumped)

Best

veritas9872 commented 5 years ago

@Auth0rM0rgan I think that is a numerical stability issue. As far as I know, there is no requirement in the current implementation that the values be in [0, 1] or [0, 255]. I have also found that excessively small values can result in NaN values being returned, but this is a problem with all PyTorch calculations, not just this function.

Auth0rM0rgan commented 5 years ago

Hey @veritas9872,

Regarding the NaN values: the SSIM value can be smaller than zero, so raising it to a fractional power yields NaNs. This can be avoided by normalizing the values, which is why I asked for a normalization feature.

Regarding the numerical stability issue, what is your suggestion for solving it? Every time I try to use SSIM, I run into that error.
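
The NaN mechanism mentioned here (a negative term raised to a fractional power) can be reproduced without a GPU. The sketch below is plain Python rather than PyTorch, uses the five per-scale exponents from the original MS-SSIM paper, and clips negative terms to zero before the power. The function name `combine_scales` and the `clamp` option are hypothetical illustrations, and clipping is only one possible workaround, not necessarily what the PR does.

```python
import math

# Per-scale exponents from the original MS-SSIM paper (five scales).
WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)


def combine_scales(cs_values, ssim_last, weights=WEIGHTS, clamp=True):
    """Combine per-scale terms: prod(cs_j ** w_j) * ssim_last ** w_last.

    cs_values holds the contrast-structure means of the finer scales and
    ssim_last is the full SSIM mean at the coarsest scale.
    With clamp=True, negative terms are clipped to 0 before the power
    (a hypothetical workaround for illustration).
    """
    terms = list(cs_values) + [ssim_last]
    if len(terms) != len(weights):
        raise ValueError("need exactly one term per weight")
    result = 1.0
    for term, weight in zip(terms, weights):
        if clamp:
            term = max(term, 0.0)
        elif term < 0.0:
            # Python's ** would return a complex number here; float tensor
            # libraries return NaN instead, which this branch mimics.
            return math.nan
        result *= term ** weight
    return result


terms = [0.9, 0.8, -0.05, 0.7]
print(combine_scales(terms, 0.6, clamp=False))  # nan
print(combine_scales(terms, 0.6, clamp=True))   # 0.0
```

Note that clipping trades the NaN for a hard zero, so a single negative scale drives the whole product to zero.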

veritas9872 commented 5 years ago

I would simply multiply the inputs by a big number. The SSIM is invariant to the value scale, but not everyone needs standardization, which would make the process slower than it already is.
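
One caveat on the scale claim: because the constants C1 and C2 depend on the dynamic range L, SSIM is unchanged by rescaling only when L is rescaled together with the inputs. The sketch below checks this in plain Python with a single global window instead of the Gaussian sliding window; `global_ssim` and its `data_range` parameter are names chosen for this illustration, not part of any implementation discussed in this thread.

```python
def global_ssim(x, y, data_range, k1=0.01, k2=0.03):
    """SSIM over one global window (no Gaussian weighting), for illustration."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


x = [0.10, 0.20, 0.15, 0.30]
y = [0.12, 0.18, 0.20, 0.25]
scale = 100.0
xs = [scale * v for v in x]
ys = [scale * v for v in y]

base = global_ssim(x, y, data_range=1.0)
scaled_range = global_ssim(xs, ys, data_range=scale)  # L scaled with the data
fixed_range = global_ssim(xs, ys, data_range=1.0)     # L left unchanged

print(abs(base - scaled_range) < 1e-9)  # True
print(abs(base - fixed_range) > 1e-3)   # True
```

So multiplying the inputs by a large number changes the result unless the data range passed to the function is multiplied by the same factor.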

Auth0rM0rgan commented 5 years ago

Hey @veritas9872, thanks. Do you have any idea how I can fix the error below? I'm not able to use the SSIM loss because of it.

/pytorch/aten/src/THC/THCIntegerDivider.cuh:82: IntDivider::IntDivider(unsigned int): Assertion `divisor >= 1 && divisor <= (2147483647)' failed. Aborted (core dumped)

veritas9872 commented 5 years ago

I have never faced this error so I cannot say for sure. However, I would recommend copying the current implementation from here. It seems to work well. I would recommend designing a new function to implement (1 - SSIM) that caches the kernel and weights, since the current functional implementation copies the kernel and weights every time.

Auth0rM0rgan commented 5 years ago

@veritas9872, I'm not getting an error when I am using SSIM but when I used MS_SSIM, I faced that error. Have you tried MS_SSIM?

veritas9872 commented 5 years ago

@Auth0rM0rgan No I have not. Could you try using anomaly detection? This function here might be useful.

veritas9872 commented 5 years ago

@Auth0rM0rgan @MKFMIKU After reading the implementation, I think that it would be better simply to expand the kernel dynamically than to specify the channel number beforehand. I have explained my request in more detail in the PR.

veritas9872 commented 5 years ago

@MKFMIKU I think that performing the SSIM operation with two 1D kernels, as suggested in #22289, would be an excellent idea. The current implementation takes a lot of time since there are five convolution operations that need to be done just for SSIM. For MS-SSIM, this figure increases to 20, I believe.
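
The two-1D-kernel idea works because a 2D Gaussian window is the outer product of a 1D Gaussian with itself, so each 2D filtering pass can be replaced by a row pass followed by a column pass (for an 11x11 window, 22 multiplications per pixel instead of 121). The five filtered quantities are typically the windowed means of x, y, x*x, y*y, and x*y. The following is a plain-Python check of the separability property only, assuming the usual 11-tap, sigma = 1.5 window; the function names are illustrative.

```python
import math


def gaussian_1d(size=11, sigma=1.5):
    """Normalized 1D Gaussian window."""
    center = (size - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_2d(size=11, sigma=1.5):
    """Normalized 2D Gaussian window built directly from the 2D formula."""
    center = (size - 1) / 2
    raw = [
        [math.exp(-((i - center) ** 2 + (j - center) ** 2) / (2 * sigma ** 2))
         for j in range(size)]
        for i in range(size)
    ]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


g1 = gaussian_1d()
g2 = gaussian_2d()
# Compare the outer product of the 1D window with the directly built 2D window.
max_err = max(
    abs(g1[i] * g1[j] - g2[i][j]) for i in range(11) for j in range(11)
)
print(max_err < 1e-12)  # True
```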

veritas9872 commented 5 years ago

For anyone trying to implement SSIM on their own, this is the link to the official skimage implementation of SSIM. There is also an skvideo implementation of SSIM here. Also, skvideo has an implementation of MS-SSIM here. There are several factors to consider when converting these to PyTorch, but they might prove to be useful references for anyone who wants to try their own hand at writing SSIM and MS-SSIM.

veritas9872 commented 5 years ago

ssim and ms-ssim with 1D kernel.txt

After going through the code by @VainF, I have written a version that might be usable. I have tested it against TensorFlow's SSIM and they give very close values. However, there are significant, but still small, differences with the values given by skimage's compare_ssim. The code might need some extra work for backward-compatibility. Also, the documentation is not great and some unnecessary parts are there to keep pylint happy. They might need to be removed. I am especially worried about the effect of self.one in the object oriented versions. I do not know what they might do to the gradients in backprop, though it probably doesn't matter.

I have found that my implementation is very fast. Just be careful to always send it to the device the data is using. The cached buffers require this, unlike most loss functions.

veritas9872 commented 5 years ago

I also referenced TensorFlow's SSIM function as well as the current version being developed by @MKFMIKU. I have also found a typo in the comments: the comment should read "unbiased covariance", not "unbiased conv", and the code does not actually implement it. However, neither does TensorFlow.
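
For readers unfamiliar with the distinction: over a uniform window of n samples, the biased covariance estimate divides by n and the unbiased one by n - 1. The sketch below shows the difference in plain Python with uniform weights; it is an illustration only and does not reproduce the Gaussian-weighted statistics that the SSIM implementations in this thread compute.

```python
def covariance(x, y, unbiased=False):
    """Sample covariance; divides by n - 1 when unbiased, else by n."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    total = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return total / (n - 1 if unbiased else n)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(covariance(x, y))                 # 2.5
print(covariance(x, y, unbiased=True))  # 3.3333333333333335
```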

John1231983 commented 5 years ago

@soumith: It would be great if SSIM supported 5D tensors. Thanks.

snk4tr commented 4 years ago

Any news on SSIM/MS-SSIM integration?

gchanan commented 4 years ago

I asked @fmassa his opinion on this and:

about this issue, it is indeed a metric which is widely used on image quality / compression works. It's not something we differentiate over though, it's kind of the Acc@1 metric for classification.

It could be a fit for torchvision, but we don't have a "metrics" package there yet, so for now I'd say neither.

SSIM is implemented in kornia https://github.com/kornia/kornia/blob/master/kornia/losses/ssim.py , and it's a few lines of code

So, to summarize: We'd accept a PR that implements this, but it's (probably) blocked on https://github.com/pytorch/pytorch/issues/22439.

francois-rozet commented 3 years ago

I have implemented a package of fast IQA metrics (including SSIM and MS-SSIM) in PyTorch: https://github.com/francois-rozet/piqa Any feedback would be appreciated.

Laktus commented 8 months ago

Why is this still open?