NOlivier-Inria opened this issue 4 years ago
@NOlivier-Inria
> Could it be that 128px training requires different parameters to replicate the results?
Yes, it is better to try several hyperparameters for different resolutions. Reducing `w_hpf` makes the generator produce more diverse images, at the cost of preserving the source identity less. Try values of `w_hpf` lower than 1 (e.g., 0.1, 0.25, 0.5).
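For reference, a minimal sketch of how a lower `w_hpf` can be passed when launching training, assuming the `main.py` flags from this repository's README (the other flags and the data paths are illustrative):

```bash
# Sketch: retrain at 128px with a reduced high-pass filter weight by
# overriding only --w_hpf on top of the README's celeba-hq training command.
python main.py --mode train --img_size 128 --w_hpf 0.25 \
               --train_img_dir data/celeba_hq/train \
               --val_img_dir data/celeba_hq/val
```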
Thank you for your answer. I re-trained with a lower `w_hpf` and got slightly better results:

- `w_hpf = 0.25`: FID (15.82 : 18.57), LPIPS (0.260 : 0.241)
- `w_hpf = 0.01`: FID (13.42 : 16.87), LPIPS (0.266 : 0.232)
Interestingly, the FID is slightly better, but the LPIPS remains worse (than both the 256px model and the 50k-iteration 128px one). I suppose the missing conv layer, compared to a 256px network, can help explain it.
I've tried to replicate your results for celeba-hq using your instructions.

First, downloading the data:
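(The exact download command was not captured in this copy of the issue; presumably it was the `download.sh` script from the README. A sketch, assuming the target names listed there:)

```bash
# Sketch: fetch the CelebA-HQ dataset, the pretrained wing (FAN) network used
# by the high-pass filter, and the pretrained celeba-hq model for comparison.
bash download.sh celeba-hq-dataset
bash download.sh wing
bash download.sh pretrained-network-celeba-hq
```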
Then executing the command given on this GitHub for 100k iterations (the only difference being `--img_size 128`):
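(The training command itself was presumably the README's celeba-hq example with only the resolution changed; a sketch under that assumption, with paths and loss weights taken from that example:)

```bash
# Sketch of the README celeba-hq training command with the resolution
# lowered to 128px; all other flags keep their README values.
python main.py --mode train --num_domains 2 --img_size 128 --w_hpf 1 \
               --lambda_reg 1 --lambda_sty 1 --lambda_ds 1 --lambda_cyc 1 \
               --train_img_dir data/celeba_hq/train \
               --val_img_dir data/celeba_hq/val
```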
The results I am getting are considerably worse than those of your pretrained model:
- FID (latent : reference) at 100k: (16.80 : 19.65) instead of (13.73 : 23.84)
- LPIPS (latent : reference) at 100k: (0.232 : 0.228) instead of (0.452 : 0.389)

Interestingly, performance is better at 50k iterations: FID of (13.49 : 19.46), LPIPS of (0.303 : 0.252).
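(These FID/LPIPS numbers are presumably produced by the repository's `--mode eval`; a sketch of how a specific checkpoint could be scored, where `--resume_iter` selects the 50k or 100k checkpoint and the directory paths are illustrative:)

```bash
# Sketch: compute FID and LPIPS for the checkpoint saved at iteration 100k.
python main.py --mode eval --num_domains 2 --img_size 128 --w_hpf 1 \
               --resume_iter 100000 \
               --train_img_dir data/celeba_hq/train \
               --val_img_dir data/celeba_hq/val \
               --checkpoint_dir expr/checkpoints/celeba_hq \
               --eval_dir expr/eval/celeba_hq
```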
I use pytorch 1.4.0, torchvision 0.5.0, and cudatoolkit 10.0.130, as in the dependency installation instructions:
`conda install -y pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.0 -c pytorch`
What could explain this behavior? Could it be that 128px training requires different parameters to replicate the results?