xthan / VITON

Code and dataset for paper "VITON: An Image-based Virtual Try-on Network"

some questions about the network input #5

Closed: sanggehouzhihoujue closed this issue 6 years ago

sanggehouzhihoujue commented 6 years ago

The input images are resized to 256*192. Was the height set to 256 to fit the VGGNet? Have you tried training the network with a larger input size? Would a larger size lead to worse results? Looking forward to your reply, thanks. @xthan

xthan commented 6 years ago

Hi,

The VGGNet used for calculating the perceptual loss is fully convolutional, so its input does not need to match the size the VGGNet uses for classification. Similarly, most style transfer frameworks do not constrain the input image size.
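
To illustrate the point, here is a minimal perceptual-loss sketch (in PyTorch for brevity, although this repo is TensorFlow; the cutoff at relu4_1 and the L1 distance are illustrative assumptions, not the paper's exact loss). Because only VGG's convolutional layers are used, the same code accepts any spatial resolution:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Keep only the convolutional part of VGG19, up through relu4_1
# (layer index is an illustrative choice, not the paper's exact layers).
vgg_features = models.vgg19(pretrained=True).features[:21].eval()
for p in vgg_features.parameters():
    p.requires_grad = False

def perceptual_loss(pred, target):
    # pred/target: (N, 3, H, W) with any H, W -- no fixed-size constraint,
    # since there are no fully connected layers in `vgg_features`.
    return F.l1_loss(vgg_features(pred), vgg_features(target))

# Works at the paper's 256x192 resolution ...
loss_small = perceptual_loss(torch.rand(1, 3, 256, 192), torch.rand(1, 3, 256, 192))
# ... and equally at a larger size, e.g. 512x384.
loss_large = perceptual_loss(torch.rand(1, 3, 512, 384), torch.rand(1, 3, 512, 384))
```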

sanggehouzhihoujue commented 6 years ago

@xthan Thank you. But when I test the network with your model at a larger input size, I get an even worse result.

xthan commented 6 years ago

Yes. If you want to test on larger images, you need to train on larger images. Although VGGNet does not require a fixed image size, the encoder-decoder is trained for a specific image size.
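
One possible workaround, sketched below with PIL (the `run_model` helper is hypothetical, not a function in this repo): resize test inputs down to the training resolution before inference with the released model, then upsample the output, rather than feeding larger images directly.

```python
from PIL import Image

# Resolution the released VITON model was trained at.
TRAIN_H, TRAIN_W = 256, 192

def try_on_at_train_size(person_img, cloth_img, run_model):
    # `run_model` is a placeholder for the repo's inference call
    # (an assumption for illustration, not an actual function here).
    orig_size = person_img.size  # PIL size is (W, H)
    person = person_img.resize((TRAIN_W, TRAIN_H), Image.BILINEAR)
    cloth = cloth_img.resize((TRAIN_W, TRAIN_H), Image.BILINEAR)
    result = run_model(person, cloth)  # PIL image at 256x192
    # Upsample the try-on result back to the original resolution.
    return result.resize(orig_size, Image.BILINEAR)
```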

sanggehouzhihoujue commented 6 years ago

@xthan OK, thank you. I will try.