[Open] purpleman-ljl opened this issue 5 years ago
Thank you for your implementation of Unsupervised Attention-guided Image-to-Image Translation. After studying your code, I found a difference between it and the description in the original paper.

Your code:

attnMapA = toZeroThreshold(AttnA(realA))
fgA = attnMapA * realA
bgA = (1 - attnMapA) * realA
genB = genA2B(fgA)
fakeB = (attnMapA * genB) + bgA

In the original paper, however, the input image is fed to the generator, the learned mask is applied to the generated image with an element-wise product, and the background is then added back using the inverse of the mask applied to the input image:

attnMapA = toZeroThreshold(AttnA(realA))
fgA = attnMapA * realA
bgA = (1 - attnMapA) * realA
genB = genA2B(realA)
fakeB = (attnMapA * genB) + bgA

According to the paper, genB = genA2B(realA) rather than genB = genA2B(fgA), and then fakeB = (attnMapA * genB) + bgA. Could you tell me why you implemented it this way? Finally, please forgive my bad English if it annoys you. =。=!

I think the original paper means that the input of the generator is the image without the background, which is fgA. The model figure on GitHub follows version 3 of the paper, while the code follows version 1; the main point of the two figures is the same.
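For concreteness, here is a minimal PyTorch-style sketch contrasting the two forward passes discussed above. It is an illustration only: `to_zero_threshold` and its threshold value are assumptions, and `AttnA` / `genA2B` are placeholders for whatever attention and generator networks the repository actually defines.

```python
import torch

def to_zero_threshold(attn, thresh=0.1):
    # Assumed behaviour: zero out attention values below the threshold.
    return attn * (attn > thresh).float()

def forward_repo(realA, AttnA, genA2B):
    # Variant used in this repository: the generator only sees the foreground.
    attnMapA = to_zero_threshold(AttnA(realA))
    fgA = attnMapA * realA            # foreground = masked input
    bgA = (1 - attnMapA) * realA      # background = inverse-masked input
    genB = genA2B(fgA)                # generator input: foreground only
    fakeB = attnMapA * genB + bgA     # composite with the original background
    return fakeB

def forward_paper(realA, AttnA, genA2B):
    # Variant as read from the paper: the generator sees the whole input image.
    attnMapA = to_zero_threshold(AttnA(realA))
    bgA = (1 - attnMapA) * realA
    genB = genA2B(realA)              # generator input: full image
    fakeB = attnMapA * genB + bgA     # mask applied only when compositing
    return fakeB

if __name__ == "__main__":
    # Quick check with dummy identity networks and a random image batch.
    dummy = torch.nn.Identity()
    x = torch.rand(1, 3, 64, 64)
    print(torch.allclose(forward_repo(x, dummy, dummy),
                         forward_paper(x, dummy, dummy)))  # False in general
```

Even with identity networks the two variants produce different composites, which is why the choice between genA2B(fgA) and genA2B(realA) matters in practice.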