Closed zosel260 closed 2 years ago
Hi, thanks for your interest. To answer your questions,
The double_size case was trained with the mask where the hole region is filled with the value 0.5, and non-hole regions with 1.0. There is no special reason behind this choice.
We follow the implementation of "Free-Form Image Inpainting With Gated Convolution - ICCV19". They reserve the ones channel for additional user-sketch guidance. That does not apply to our work, so including or omitting the channel should have only a minor effect.
Dear authors, thank you for sharing your code. I'm very impressed by your results. While looking through the code, I found 2 things I can't understand. Could you please take a look at these questions?
1) In demo_vi.py, line 167: if opt.double_size: prevmask = masks_[:,:,2]*0.5 => Why do you multiply the mask by 0.5 in the double_size case?
2) in vinet.py, line 164
enc_input = torch.cat([masked_img, ones, ones*mask], dim=1)
For the encoder's inputs, why do you insert a ones channel? I think a ones channel carries no information for inpainting. Why do you multiply the ones channel by the mask? Isn't ones * mask the same as the original mask?
Thank you in advance.
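On the last part of question 2, a quick check suggests that ones * mask is element-wise identical to mask. The snippet below is only a sketch: the tensor shapes and the random binary mask are assumptions made for illustration and are not taken from vinet.py. Only the torch.cat line mirrors the quoted code.

```python
import torch

# Hypothetical shapes for illustration only: batch, frames, height, width.
N, T, H, W = 1, 5, 4, 4
masked_img = torch.rand(N, 3, T, H, W)            # stand-in for the masked RGB frames
mask = (torch.rand(N, 1, T, H, W) > 0.5).float()  # stand-in binary mask
ones = torch.ones_like(mask)

# Same concatenation as the line quoted from vinet.py.
enc_input = torch.cat([masked_img, ones, ones * mask], dim=1)

# Multiplying by an all-ones tensor leaves every element unchanged.
same_as_mask = torch.equal(ones * mask, mask)

print(enc_input.shape)  # channel dimension is 3 + 1 + 1
print(same_as_mask)
```

So in this setup the last channel is just the mask, and the ones channel is a constant plane.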