leehomyc / Faster-High-Res-Neural-Inpainting

High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis
http://www.harryyang.org/inpainting
MIT License

About the texture network code #14

Closed stylite5 closed 7 years ago

stylite5 commented 7 years ago

Hello! I have read your article and the code, and I have some questions I would like to ask:

1. Is the content loss in the texture network code file (transfer_CNNMRF_wrapper) the same as the content loss of the content network?
2. How exactly does PatchMatch search for the nearest patch? I printed target_feature_map and found that the 18 kinds of convolution images were divided into 1933 + 493 = 2426 patches in total. These patches were then sampled, and the number of sampled patches equals the number of patches each convolution image was divided into. I did not see anything in the code that distinguishes the inside of the hole from the outside. Does the code differ from the article here?
3. Does PatchMatch look for the closest patch in the middle-layer feature map of the input image at each scale? The input is the image with the hole cut out. Is the result of each iteration fed back as the input image, so that the search continues in the middle-layer feature map of that image?
4. I printed the output of the net at the first scale. It is a convolution feature map of size 512×16×16. How is it turned into the optimal output image x? Is the optimal output image x of the first scale returned as the image named fake and passed into the texture network again? If not, where is the x that is used to initialize the next scale?
5. I printed the net and found that it uses only ten convolution layers, unlike VGG's sixteen convolution layers, and that it has no fully connected layers. Why is that?

I look forward to your reply. Thank you.

leehomyc commented 7 years ago
  1. No, they are not the same. The texture network actually uses L-BFGS, and the content is only used as initialization.
  2. There is no PatchMatch. We use the patches from the style part as filters and do a forward pass, then take the patch with the largest response as the nearest patch. This is the same approach as in the CNNMRF paper, which describes the algorithm clearly.
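
For readers following answer 2, here is a minimal sketch of the "style patches as filters" idea. It is not the repository's Torch code: the function names (`extract_patches`, `normalize`, `nearest_patches`) are hypothetical, feature maps are plain nested lists of shape C x H x W, and normalizing each filter is an assumption modeled on the normalized cross-correlation used in CNNMRF.

```python
import math


def extract_patches(fmap, size):
    """Return every size x size patch of a C x H x W feature map, flattened."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    patches = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            patches.append([
                fmap[c][top + dy][left + dx]
                for c in range(channels)
                for dy in range(size)
                for dx in range(size)
            ])
    return patches


def normalize(vec):
    """Scale a flattened patch to unit length (all-zero patches are left as-is)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def nearest_patches(target_fmap, style_fmap, size=3):
    """For each target patch, return the index of the best-matching style patch.

    Each normalized style patch acts as one convolution filter. The dot product
    with a target patch is that filter's response at that location, and the
    filter with the largest response is taken as the nearest patch.
    """
    filters = [normalize(p) for p in extract_patches(style_fmap, size)]
    matches = []
    for patch in extract_patches(target_fmap, size):
        responses = [sum(a * b for a, b in zip(patch, f)) for f in filters]
        matches.append(max(range(len(filters)), key=responses.__getitem__))
    return matches
```

In a real network the loop over target patches is a single convolution forward pass whose output has one channel per style patch; the argmax over channels at each spatial location then gives the nearest-patch index.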