mit-han-lab / gan-compression

[CVPR 2020] GAN Compression: Efficient Architectures for Interactive Conditional GANs

Are there any specific options to calculate real statistics and FID? #81

Closed steechee closed 3 years ago

steechee commented 3 years ago

Thank you for sharing your research. Are there any specific options needed to calculate the statistics from the dataset and measure the FID values? I tried to compute the statistics from the data (`python get_real_stat.py --dataroot database/horse --output_path horse.npz --dataset_mode single --gpu_ids 0`), but the statistics I calculated differ from the provided real statistics. I changed the `dataset_mode` and `preprocess` options, but the difference remains. I also tried to compute FID from the calculated statistics and the results folder (`python pytorch-fid-master/src/pytorch_fid/fid_score.py horse2zebra_B.npz fakezebra_results_folder_dir --device cuda:1`), but the FID values are not reproducible.

lmxyy commented 3 years ago

We use all the images (both the train and test sets) of horse2zebra to compute the real statistics; did you do that? Also, how large is the difference?
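To follow that advice, the train and test splits of one domain need to be visible to `get_real_stat.py` at once. Below is a minimal sketch of one way to do that, assuming the standard CycleGAN dataset layout with `trainB/` and `testB/` subfolders (the function name `merge_splits` and the output folder name are illustrative, not part of the repo):

```python
import shutil
from pathlib import Path

def merge_splits(dataroot, domain="B", out_dir="all_B"):
    """Copy the train and test images of one domain into a single folder,
    so the statistics script sees the full set (train + test) at once."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    n = 0
    for split in (f"train{domain}", f"test{domain}"):
        for img in sorted(Path(dataroot, split).glob("*")):
            if img.suffix.lower() in {".jpg", ".jpeg", ".png"}:
                # Prefix with the split name to avoid filename collisions.
                shutil.copy(img, out / f"{split}_{img.name}")
                n += 1
    return n
```

You would then point `--dataroot` of `get_real_stat.py` at the merged folder.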

steechee commented 3 years ago

Yes, I followed the directions in the paper, starting from the full CycleGAN horse2zebra setting. For the real set, I used two statistics: one is the provided horse2zebra_B.npz, and the other is a zebra.npz that I calculated from the 1334 train + 140 test zebra images (`python get_real_stat.py --dataroot ... --output_path ... --dataset_mode single --gpu_ids ...`). For the fake set, I translated all 120 real horse test images to get 120 fake zebras. I also used two sets of test results: one from the downloaded full pre-trained model, and another from a full model I trained myself (using the default settings in scripts/cycle_gan/horse2zebra/train_full.sh). Then I used pytorch-fid (https://github.com/mseitzer/pytorch-fid) to compute the FID (`python pytorch-fid-master/src/pytorch_fid/fid_score.py horse2zebra_B.npz fakezebra_results_folder_dir --device cuda:1`). The FID values I got are shown below. [image] None of them exactly matches the 65.75 on the webpage or the 61.53 in the paper.

Furthermore, I'm also interested in the FID values of the pre-trained original CycleGAN model. Some GAN compression papers share the FID values from "Co-Evolutionary Compression for Unpaired Image Translation" (horse2zebra 74.04, zebra2horse 148.81, summer2winter 79.12, winter2summer 73.31), while you report the pre-trained model's FID as 71.84 in Table 8. Is this due to the initialization, or are there some unmentioned common settings for training a CycleGAN and calculating FID? Or is a small difference in FID negligible?

lmxyy commented 3 years ago

I see. This is because I retrained the full model to ensure the reproducibility of the codebase, and I only released the retrained full model. Besides, the released statistics were computed with the old code, which may differ slightly from the code in this repo, so the FID may show minor differences (e.g., within a range of ±3). I think your results are okay: FID is not a stable metric, and a variance of around 3 is common. Either your recalculated statistics or our released statistics will work.
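For anyone comparing `.npz` statistics files directly, the Fréchet distance itself is a short computation. Here is a sketch (equivalent in spirit to pytorch-fid's `calculate_frechet_distance`; I'm assuming the `.npz` files store the mean `mu` and covariance `sigma` of the Inception activations):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID between the two Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        # Drop tiny imaginary components introduced by numerical error.
        covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```

Note that the fake-side statistics depend on which images are in the results folder and on the Inception preprocessing, which is one more reason small FID discrepancies between runs are expected.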