deeptibhegde opened this issue 5 years ago
Edit: "restoration" not "super-resolution"
I couldn't run the "restoration" code. When I run it I get the error: name 'Concat' is not defined. I don't know why `Concat` is not defined (it comes from the `skip` network). Do you know how to fix it? Thank you.
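For reference, this NameError usually appears when the helper modules from the repo (models/common.py) are not imported before building the `skip` network. As a hedged sketch only (the repo's actual `Concat` also center-crops its branch outputs to a common spatial size, which is omitted here), such a module can look like:

```python
import torch
import torch.nn as nn

# Minimal sketch of a Concat module in the spirit of the one in
# models/common.py: run the input through several sub-modules and
# concatenate their outputs along a given dimension. Assumes the
# branch outputs already have matching spatial sizes.
class Concat(nn.Module):
    def __init__(self, dim, *modules):
        super().__init__()
        self.dim = dim
        self.branches = nn.ModuleList(modules)

    def forward(self, x):
        # Each branch sees the same input; results are concatenated.
        return torch.cat([m(x) for m in self.branches], dim=self.dim)
```

For example, `Concat(1, nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(3, 4, 3, padding=1))` applied to a `(1, 3, 16, 16)` tensor yields `(1, 12, 16, 16)`: the two branches' 8 and 4 channels are stacked along dim 1.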
@fengyayuan I did not face a similar issue, however did you ensure that you have a stable version of PyTorch installed? Were you able to run the other codes? Did you face a similar problem using UNet architecture?
Thanks for your reply. I can use the UNet in "super-resolution", but I can't use `skip` in "restoration". I'm not sure whether it's an instability in my PyTorch install.
@d-b-h I see the same issue in the inpainting code.
In the code, the MSE is calculated by comparing out*mask_var and img_var*mask_var. But if we do that, we assume we know the noise/mask, and that seems somewhat like cheating.
I will try adding some regularization to see whether the network can get rid of the mask. Have you run more tests or had new ideas since then?
Thanks
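Concretely, the masked loss being discussed can be sketched like this (variable names follow the repo's inpainting notebook; the tensors and hole location are illustrative stand-ins):

```python
import torch

# Sketch of the masked MSE used in the inpainting example:
# the loss compares the network output and the corrupted image
# only where mask == 1 (the known, uncorrupted region), so the
# hole contributes nothing to the gradient.
mse = torch.nn.MSELoss()

out = torch.rand(1, 3, 8, 8)        # network output (stand-in)
img = torch.rand(1, 3, 8, 8)        # corrupted target image (stand-in)
mask = torch.ones(1, 1, 8, 8)
mask[..., 2:6, 2:6] = 0             # hole: pixels to be inpainted

loss = mse(out * mask, img * mask)  # pixels inside the hole are zeroed out
```

Note the implication: only the hole *location* is assumed known, not the clean image, since both operands are multiplied by the same mask.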
@AlexanderZhujiageng I haven't been successful in getting decent results with unknown masks. I read the paper again, but it is not clear to me if this aspect is inherent to the method or if I am just missing something!
Did the regularisation work?
@d-b-h The regularization didn't work.
I have double checked the paper. In the paper the loss function is the masked data term

E(x; x0) = ||(x - x0) ⊙ m||²,

where m is the binary mask. It assumes we know what the original image is.
It seems that we have to train the network on a large dataset, as we usually do, for this to work with unknown masks.
I am late to the party, but here are my inputs:
You do require a mask to indicate which parts of the given damaged image are missing. Blind inpainting (i.e. inpainting without a mask) is a much more difficult problem, even for traditional approaches that learn a model. The method in this paper is not defined for "blind inpainting".
The reason you get the right image during the inference stage is that the model is now acting as a "prior" (hence the name of the paper! :) ). It learns this prior distribution by looking at (read: reconstructing) the other available pixels. During the inference stage, it predicts the missing pixels. Note that the inference stage is the same for all image restoration tasks shown in the paper (denoising, super-resolution, inpainting, etc.), i.e. x^* = f_\theta(z), where z is the fixed input, \theta are the optimized weights and f_\theta is the conv. network.
Hope this helps.
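The inference stage described above can be sketched as a plain optimization loop. The tiny conv net below is only a stand-in for the paper's encoder-decoder, and all sizes, learning rate, and iteration count are illustrative:

```python
import torch
import torch.nn as nn

# Minimal sketch of deep-image-prior inference: fix a random input z,
# then optimize the network weights theta so that f_theta(z) matches
# the observed pixels; x^* = f_theta(z) is the restored estimate.
torch.manual_seed(0)
f = nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid())

z = torch.rand(1, 8, 16, 16)                      # fixed random input
x0 = torch.rand(1, 3, 16, 16)                     # observed (corrupted) image
mask = (torch.rand(1, 1, 16, 16) > 0.3).float()   # 1 = known pixel

opt = torch.optim.Adam(f.parameters(), lr=0.01)
for _ in range(50):                               # a few steps for illustration
    opt.zero_grad()
    loss = ((f(z) - x0).pow(2) * mask).mean()     # masked data term
    loss.backward()
    opt.step()

x_star = f(z).detach()                            # restored image estimate
```

The key point matching the explanation above: nothing is "trained" on a dataset; the weights are fit to this single observation, and the network architecture itself supplies the prior over the hole.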
@DmitryUlyanov how do you explain the mask in the loss function?
It seems intuitive to me that the mask should be 1 for missing pixels, because those are the pixels we want to predict, so the loss function should focus on them. For valid pixels the model does not need to recover anything, so the loss does not need to pay attention to them, and it seems intuitive to filter them out by multiplying by 0.
I dug into your implementation, and it seems that you use the mask in the opposite way to what I described above. I could not find much more detail about it in the paper; could you please explain the logic?
Thanks
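A toy numerical check of the two conventions (all names and values illustrative) shows what each choice supervises: with mask = 1 on valid pixels the loss ignores the hole, while the flipped convention would ignore the valid content instead:

```python
import torch

# Toy check of the two mask conventions discussed above. In the repo,
# mask == 1 marks KNOWN pixels, so the loss is measured on surviving
# content and the hole is left free for the prior to fill in.
mse = torch.nn.MSELoss()
out = torch.zeros(1, 1, 4, 4)     # network output (stand-in)
img = torch.ones(1, 1, 4, 4)      # target image (stand-in)

known = torch.ones(1, 1, 4, 4)
known[..., 1:3, 1:3] = 0          # a 4-pixel hole
missing = 1 - known               # the opposite convention

repo_loss = mse(out * known, img * known)         # supervises the 12 known pixels
flipped_loss = mse(out * missing, img * missing)  # would supervise only the hole
```

With these values `repo_loss` works out to 12/16 = 0.75 and `flipped_loss` to 4/16 = 0.25, which makes the difference concrete: flipping the mask would fit the network to the hole (where the target is meaningless) and ignore the pixels we actually trust.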
@abhishekaich27 how do you explain the mask in the loss function?
It seems intuitive to me that the mask should be 1 for missing pixels, because those are the pixels we want to predict, so the loss function should focus on them. For valid pixels the model does not need to recover anything, so the loss does not need to pay attention to them, and it seems intuitive to filter them out by multiplying by 0.
I dug into your implementation, and it seems that you use the mask in the opposite way to what I described above. I could not find much more detail about it in the paper; could you please explain the logic?
Thanks
This is not my implementation. The author is @DmitryUlyanov !
Oh, sorry for the mis-reference. Thanks for the response.
After reading the example code, I think inpainting cannot really be implemented with this project, because the author operates the U-Net like a GAN. In the example, it has to use the original, uncorrupted image as the target tensor of the loss function to make the network generate a clean image. In the real world we only have the corrupted image and the mask, not the original image, so this procedure does not apply. To sum up, I conclude that this project is useless for the inpainting use case.
An unrealistic paper.
can't agree more!
Hello, I have run the super-resolution and inpainting code on the provided examples as well as my own test cases. Noise that is already present in the image, i.e. not added by the mask, is reproduced in the generated image. I get similar results with the text mask (text already present in the image is not removed).
Does this mean the inverse operation only works for masks applied to the image? Doesn't this reduce the potential for real-world application?
I apologize for any misunderstanding, but some clarification would be appreciated. Thank you.