NVlabs / SPADE

Semantic Image Synthesis with SPADE
https://nvlabs.github.io/SPADE/

Replicating COCO model training #51

Open matttrent opened 5 years ago

matttrent commented 5 years ago

I'm trying to train SPADE on the COCO dataset and reproduce the FID numbers. I've already managed to replicate the FID score for the model trained on ADE20K, and just want to confirm a few of the settings. It's a bit unclear from Appendix A how the COCO settings differ from the ADE20K ones. So:

  1. I see the number of epochs is 100. For COCO, is the learning rate constant for all 100 epochs, or is it constant for the first 50 and then linearly decayed to zero over the second 50? (My current understanding is sketched at the end of this comment.)
  2. --batchSize is still 32 (assuming 8 GPUs)?
  3. --coco_no_portraits should not be set (False)?
  4. --use_vae should not be set (False)?

Any other setting differences I should be aware of?

Thanks again for all the helpful replies.
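
To make sure question 1 is clear, here is the schedule I'm assuming, as a rough Python sketch. The 50/50 split and the base learning rate of 0.0002 are my guesses (the kind of thing `--niter` / `--niter_decay` would control), not confirmed settings:

```python
# Sketch of the schedule I'm assuming in question 1: the learning rate stays
# constant for the first `niter` epochs, then decays linearly to zero over the
# following `niter_decay` epochs (50 + 50 = 100 total epochs).
def lr_for_epoch(epoch, base_lr=0.0002, niter=50, niter_decay=50):
    """Learning rate at a given epoch (0-indexed) under the assumed schedule."""
    if epoch < niter:
        return base_lr
    # Linear decay from base_lr down to 0 over the remaining niter_decay epochs.
    return base_lr * (1.0 - (epoch - niter + 1) / float(niter_decay))

# e.g. lr_for_epoch(0) == 2e-4, lr_for_epoch(49) == 2e-4, lr_for_epoch(99) == 0.0
```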

Cold-Winter commented 5 years ago

Could you please point me to the source code you used for calculating the FID numbers? I found one here: https://github.com/NVlabs/stylegan/blob/master/metrics/frechet_inception_distance.py. Did you use the same one?

matttrent commented 5 years ago

I used the code from this repo: https://github.com/mseitzer/pytorch-fid

The numbers it produced for both ADE20K and COCO-Stuff match what's published in the paper.
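
For what it's worth, the final number that tool reports is just the Fréchet distance between Gaussians fitted to the Inception activations of the real and generated image sets. A self-contained sketch of that last step (this is the standard formula, not the repo's code; `act_real` / `act_fake` are placeholder activation matrices, one row per image):

```python
import numpy as np
from scipy import linalg

def frechet_distance(act_real, act_fake, eps=1e-6):
    """FID between two sets of Inception activations, each of shape (N, 2048)."""
    mu1, sigma1 = act_real.mean(axis=0), np.cov(act_real, rowvar=False)
    mu2, sigma2 = act_fake.mean(axis=0), np.cov(act_fake, rowvar=False)

    diff = mu1 - mu2
    # Matrix square root of the product of the covariances; retry with a small
    # diagonal offset if the product is near-singular and sqrtm misbehaves.
    covmean, _ = linalg.sqrtm(sigma1.dot(sigma2), disp=False)
    if not np.isfinite(covmean).all():
        offset = np.eye(sigma1.shape[0]) * eps
        covmean, _ = linalg.sqrtm((sigma1 + offset).dot(sigma2 + offset), disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    return diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2 * np.trace(covmean)
```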

Cold-Winter commented 5 years ago

Thanks for your response. I tested this code on Cityscapes and got an FID of 63, which is noticeably lower (better) than the 71 reported in the paper. Did you reproduce that result? Also, when computing the FID score, should we rescale the images to 299×299, the input size of the Inception network?
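
To be concrete, by rescaling I mean a preprocessing step like this rough torchvision sketch (the resize choice and the file path are just placeholders for illustration, not something from the repo):

```python
from PIL import Image
from torchvision import transforms

# Resize both real and generated images to the Inception-v3 input resolution
# before extracting activations for FID.
to_inception_input = transforms.Compose([
    transforms.Resize((299, 299)),  # bilinear by default
    transforms.ToTensor(),          # float tensor in [0, 1], 3 x 299 x 299
])

img = Image.open("path/to/generated_image.png").convert("RGB")
x = to_inception_input(img).unsqueeze(0)  # 1 x 3 x 299 x 299 batch
```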

ShihuaHuang95 commented 5 years ago

@matttrent When you calculate the FID score, do you use the training samples or the validation samples as the real images? I'm confused because I got 32.12 when I used the resized training samples as input and 39.65 when I used the resized validation samples.