mixuala opened this issue 4 years ago
As far as I know, torchvision's VGG assumes RGB image input with pixel values in [0,1], in contrast to the original VGG weights, which assume BGR image input with pixel values in [0,255].
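For reference, here's a minimal sketch of the two conventions I mean (tensor names and the 224x224 size are just placeholders):

```python
import torch

img = torch.rand(1, 3, 224, 224)  # RGB image batch with values in [0, 1]

# torchvision VGG convention: RGB, [0, 1], then ImageNet mean/std normalization
tv_mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
tv_std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
torchvision_input = (img - tv_mean) / tv_std

# original (Caffe) VGG convention: BGR, [0, 255], mean subtraction only
caffe_mean = torch.tensor([103.939, 116.779, 123.68]).view(1, 3, 1, 1)  # B, G, R
caffe_input = img.flip(1) * 255.0 - caffe_mean  # flip channel dim: RGB -> BGR
```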
Did that work out for you?
Anyway, someone has already reported this issue to me. #10
I have uploaded the VGG16 weights here: https://drive.google.com/file/d/1a0sFcNEvmIy21PE0yp7tzJuU0vhhV0Ln/view?usp=sharing
I will also update the README and the notebooks with the appropriate link.
Thank you for reporting.
It seems to work; here's a shot of my training just before the end (below), using the PyTorch weights as mentioned.
What I'm confused about is how your network successfully learns without clipping the Transformer output before calculating the VGG losses. I rebuilt the same network in TensorFlow, but it doesn't learn unless I apply the following transforms to the Transformer output (see the rough sketch below):
- VGG mean centering
- clipping values to the domain (0., 1.)
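Concretely, something like this on the TensorFlow side (a rough sketch assuming NHWC tensors; the names are mine):

```python
import tensorflow as tf

# VGG channel means on a [0, 255] scale; exact values and channel order
# depend on which VGG weights/preprocessing convention is used
VGG_MEAN = tf.constant([123.68, 116.779, 103.939])

def prep_for_vgg(transformer_output):
    # clip the Transformer output back into the valid image domain (0., 1.)
    clipped = tf.clip_by_value(transformer_output, 0.0, 1.0)
    # VGG mean centering: scale to [0, 255] and subtract the channel means
    return clipped * 255.0 - VGG_MEAN
```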
I'm still working on it...

Interesting! Looks like it somehow worked, though I never had a result with artifacts like yours in the upper right corner.
I did add the VGG means:
content_features = VGG(content_batch.add(imagenet_neg_mean))
generated_features = VGG(generated_batch.add(imagenet_neg_mean))
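where imagenet_neg_mean is just the negated VGG/ImageNet channel means, broadcast over the batch. With the standard Caffe values it would look like this (values shown are the usual Caffe BGR means, not copied from the repo):

```python
import torch

# negated Caffe-style ImageNet channel means (BGR order, [0, 255] scale),
# shaped to broadcast over an NCHW batch; adding it == subtracting the mean
imagenet_neg_mean = torch.tensor([-103.939, -116.779, -123.68],
                                 dtype=torch.float32).view(1, 3, 1, 1)
```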
And yeah, I am equally confused that I did not need to use a Tanh output layer. In fact, Tanh produced really bad-looking images. Search for "Ditching" in my README.
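The comparison I mean is roughly between these two output stages (a simplified, hypothetical sketch, not my exact layers):

```python
import torch

def transformer_output(x, use_tanh=False):
    # x: raw output of the transformer network's final conv layer
    if use_tanh:
        # squash to [-1, 1] and rescale to the [0, 255] image range
        return (torch.tanh(x) + 1.0) * 127.5
    # "ditching Tanh": leave the conv output unbounded and rely on the VGG
    # losses (and clipping only at save time) to keep values in range
    return x
```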
I got a FORBIDDEN response for the AWS file, so I switched to the weights file that is now included with torchvision.
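In case anyone else hits the same error, loading the weights straight from torchvision looks roughly like this (the names are mine, and the weights argument is the newer torchvision API; older versions use pretrained=True):

```python
from torchvision import models

# download the ImageNet VGG16 weights via torchvision and keep the conv features
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features

# freeze it, since it is only used as a fixed loss network
for param in vgg.parameters():
    param.requires_grad = False
vgg.eval()
```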