WonwoongCho / Generative-Inpainting-pytorch

This is a PyTorch version of Generative Image Inpainting with Contextual Attention.

Problem on function "extract_patches" #2

Closed facetohard closed 5 years ago

facetohard commented 5 years ago

I found that the reimplementation of `extract_patches` may be wrong. I tested your contextual attention module against the function `test_contextual_attention`, but I got a different result. I changed it to

```python
x = self.padding(x)
x = x.permute(0, 2, 3, 1)
all_patches = x.unfold(1, kernel, stride).unfold(2, kernel, stride)
```

and then got the same results as the TF code. Maybe this difference causes the blurry results you mentioned in the README.
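The three lines above can be sketched as a standalone function. This is a minimal illustration, not the repo's exact code: the padding amount of 1 (matching a 3x3 kernel with stride 1) and the function signature are assumptions.

```python
import torch
import torch.nn as nn

def extract_patches(x, kernel=3, stride=1):
    """Extract kernel x kernel patches from a (N, C, H, W) tensor.

    Returns a view of shape (N, out_h, out_w, C, kernel, kernel),
    mirroring what tf.extract_image_patches produces channel-last.
    """
    # Assumption: zero-padding of 1, which preserves spatial size for
    # kernel=3, stride=1 (the 'SAME'-style behaviour of the TF code).
    x = nn.ZeroPad2d(1)(x)
    x = x.permute(0, 2, 3, 1)  # (N, H, W, C)
    # unfold over H then W: each call adds a trailing window dimension
    return x.unfold(1, kernel, stride).unfold(2, kernel, stride)

x = torch.randn(2, 4, 8, 8)
patches = extract_patches(x)
print(patches.shape)  # torch.Size([2, 8, 8, 4, 3, 3])
```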

robbiebarrat commented 5 years ago

@facetohard which file did you edit? model_module.py?

If you could please let me know which file and line numbers to edit, or perhaps just put the whole new file in a gist, it would really help me out. I'm getting blurry results too and badly want this fix!!

Thank you!!

facetohard commented 5 years ago

Line 263 in model_module.py

hedlmote commented 5 years ago

> I found that the reimplementation of `extract_patches` may be wrong. I tested your contextual attention module against `test_contextual_attention`, but got a different result. I changed it to `x = self.padding(x); x = x.permute(0, 2, 3, 1); all_patches = x.unfold(1, kernel, stride).unfold(2, kernel, stride)` and got the same results as the TF code. Maybe this difference causes the blurry results mentioned in the README.

When I checked the implementation, I also found that `extract_patches` is wrong. The author directly unfolds the background tensor and views the extracted patches as the weight shape needed for convolution. Although it runs without error, that operation is by no means a cross-correlation between each background patch and the foreground tensor.
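The intended operation, cross-correlating the foreground with each background patch, can be sketched by turning the extracted patches into a bank of convolution kernels. This is a simplified illustration under assumed shapes; the real module additionally handles batching, masks, downscaling, and normalization:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
b = torch.randn(1, 8, 6, 6)  # background feature map
f = torch.randn(1, 8, 6, 6)  # foreground feature map
k, s = 3, 1

# Extract background patches channel-last, as discussed above
patches = b.permute(0, 2, 3, 1).unfold(1, k, s).unfold(2, k, s)
# Flatten the patch grid into conv weights: (num_patches, C, k, k)
w = patches.reshape(-1, b.size(1), k, k)

# Output channel i = correlation of f with background patch i
scores = F.conv2d(f, w, padding=k // 2)
print(scores.shape)  # torch.Size([1, 16, 6, 6]) -> 16 patches
```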

Zealoe commented 5 years ago

Can you get better results now with this fix? Thanks!

hedlmote commented 5 years ago

The discriminator and the l2_norm also act differently from the original implementation.

Zealoe commented 5 years ago

Do you mean that the contextual attention module is right with this fix now? Thanks a lot! @hedlmote

hedlmote commented 5 years ago

Yes, I think so.

baogiadoan commented 5 years ago

> I found that the reimplementation of `extract_patches` may be wrong. I tested your contextual attention module against `test_contextual_attention`, but got a different result. I changed it to `x = self.padding(x); x = x.permute(0, 2, 3, 1); all_patches = x.unfold(1, kernel, stride).unfold(2, kernel, stride)` and got the same results as the TF code. Maybe this difference causes the blurry results mentioned in the README.

If you believe this is the fix, I think you should submit a pull request so the author can update the repo; otherwise people will keep using a buggy version. Thanks. I'll try your change and see the result.

ShnitzelKiller commented 5 years ago

I've been getting an all-gray image when running `test_contextual_attention` in this framework, even after applying the patch described here. For running `test_contextual_attention`, you are using just the output of `conv_transpose2d`, without the final `self.out(y)`, right?

Here are some example comparisons with a test image (where F and B are both the same image, and mask is None):

Input: testimg_large

Result with the TensorFlow implementation (baseline): result_tf3

Result with this framework, with the patch to fix `extract_patches`: result

EDIT: I finally narrowed down the bug: the `down_sample()` function in utils.py is completely broken. The downsampled image loses all of the information and doesn't match the original at all. I switched it to the built-in `nn.functional.interpolate()` and then it worked.
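A minimal example of the replacement described above, using the built-in `torch.nn.functional.interpolate`. The `scale_factor` and `mode` here are assumptions for illustration; the repo's `down_sample` signature may differ:

```python
import torch
import torch.nn.functional as F

# A 4x4 ramp image with batch and channel dimensions
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# Downsample by 2x; 'nearest' keeps original pixel values,
# 'bilinear' (with align_corners=False) would average neighbours
y = F.interpolate(x, scale_factor=0.5, mode='nearest')
print(y.shape)  # torch.Size([1, 1, 2, 2])
```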

EDIT 2: I found another bug that was preventing things from working properly: the normalization of the convolution patches.

```python
wi_normed = wi / torch.max(l2_norm(wi), escape_NaN)
```

I replaced it with the equivalent of the TensorFlow implementation (normalizing per kernel patch):

```python
wi_normed = wi / torch.max(torch.sqrt((wi * wi).sum([1, 2, 3], keepdim=True)), escape_NaN)
```

And NOW it actually works. Without this change, I would still sometimes get a flat-color image.

Zealoe commented 5 years ago

@ShnitzelKiller Yes, I also found this bug.

zengyh1900 commented 5 years ago

It seems that the author has no plans for this repo recently. Let's keep this issue open until the pull request is merged, so that people can find all these bugs.

Thanks!