bioinf-jku / TTUR

Two time-scale update rule for training GANs
Apache License 2.0

Where are images resized? #36

Open unixpickle opened 4 years ago

unixpickle commented 4 years ago

It appears that the code never resizes images to be the correct 299x299 for the inception model. Is it the case that all of the results on 64x64 images are obtained by feeding smaller images into the convolutional network and simply assuming that the outputs are meaningful? Or is there a resize somewhere I'm not seeing?

I also observed that resolution mattered immensely when comparing to the precomputed npz matrices in this repository. In particular, if the images were not 64x64, the FID was extremely high, so I'm assuming those npz matrices were computed by feeding 64x64 images directly into the inception graph.

bahjat-kawar commented 3 years ago

It looks like the model resizes the images to 299x299.

I am also facing the issue of extremely high FID when comparing to the precomputed npz matrices. Did you find a solution for this? Is there perhaps a different set of precomputed statistics for comparison on other resolutions (128, 256, 512)?

mhex commented 3 years ago

Hi, other datasets, including the same data at a different resolution, produce different activations in the coding layer (e.g. the last pooling layer of the Inception network), and therefore different statistics. You need to precompute the reference statistics for your dataset yourself. HTH
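As a rough illustration of what precomputing reference statistics involves, here is a minimal NumPy sketch. It assumes the coding-layer activations have already been extracted into an `(n_samples, n_features)` array, using the same preprocessing and resolution for the reference and generated images. The function names are hypothetical and are not this repository's API. The `mu` and `sigma` key names are an assumption about the layout of the precomputed npz files. The trace of the matrix square root is taken from eigenvalues rather than from an explicit matrix square root, so small numerical differences from other implementations are possible.

```python
import numpy as np


def activation_statistics(activations):
    """Return the mean vector and covariance matrix of an
    (n_samples, n_features) array of coding-layer activations."""
    act = np.asarray(activations, dtype=np.float64)
    mu = act.mean(axis=0)
    sigma = np.cov(act, rowvar=False)  # columns are features
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians:
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2)).

    Tr(sqrt(sigma1 @ sigma2)) is computed as the sum of the square roots
    of the eigenvalues of sigma1 @ sigma2. For positive semi-definite
    inputs these eigenvalues are real and non-negative; tiny negative or
    imaginary parts caused by round-off are discarded.
    """
    diff = np.asarray(mu1, dtype=np.float64) - np.asarray(mu2, dtype=np.float64)
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


def save_statistics(path, mu, sigma):
    """Store reference statistics; `path` should end in .npz.
    The key names `mu` and `sigma` are an assumption."""
    np.savez_compressed(path, mu=mu, sigma=sigma)


def load_statistics(path):
    """Load statistics previously written by save_statistics."""
    with np.load(path) as f:
        return f["mu"], f["sigma"]
```

The intended use is to run `activation_statistics` once on activations from the reference dataset at the resolution you evaluate at, save the result, and later compare it against statistics from generated samples with `frechet_distance`. Comparing against statistics computed at a different resolution would be expected to inflate the distance, which is consistent with the behaviour reported above.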