mit-han-lab / data-efficient-gans

[NeurIPS 2020] Differentiable Augmentation for Data-Efficient GAN Training
https://arxiv.org/abs/2006.10738
BSD 2-Clause "Simplified" License

FID increases for larger training datasets #55

Closed slala2121 closed 3 years ago

slala2121 commented 3 years ago

I trained a generator on 3.8K images at 256x256 resolution following the settings described, and trained another generator on 750 images with the same settings except that the regularization was increased from 1 to 10, as in the paper.

While the training FID-5K score is lower when training on 3.8K vs 750 (both training FID curves are stable), I find that the FID score computed using the generator trained on 3.8K images is much higher than the one attained with 750 images.

I am using the same code to compute the FID scores so it is unclear to me why this is the case. Have you encountered this issue?

zsyzzsoft commented 3 years ago

"While the training FID-5K score is lower when training on 3.8K vs 750" What does this sentence mean? Could you give more details?

slala2121 commented 3 years ago

When training on 3.8K images, the FID-5k score monitored during training converges to around 8.0. When training on 750 images, the FID-5k score monitored during training converges to around 33.

However, when I evaluate the FID-50k score of the generative model trained on 750 images, I get around 21, while the model trained on 3.8K images gets an FID-50k score of around 50.
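
For reference, both scores come from the same Fréchet distance; FID-5k and FID-50k differ only in how many generated samples are used to estimate the fake statistics. A minimal sketch of the computation I have in mind (assuming `real_feats` and `fake_feats` are precomputed Inception-pool activations of shape `[N, 2048]`):

```python
import numpy as np
from scipy import linalg

def frechet_distance(real_feats, fake_feats):
    """FID between two sets of Inception pool features ([N, 2048] arrays).
    FID-5k vs FID-50k only changes how many fake samples are drawn."""
    mu_r, sigma_r = real_feats.mean(axis=0), np.cov(real_feats, rowvar=False)
    mu_f, sigma_f = fake_feats.mean(axis=0), np.cov(fake_feats, rowvar=False)
    diff = mu_r - mu_f
    # Matrix square root of the product of the two covariance matrices.
    covmean, _ = linalg.sqrtm(sigma_r.dot(sigma_f), disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical imaginary noise
    return float(diff.dot(diff) + np.trace(sigma_r + sigma_f - 2.0 * covmean))
```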


zsyzzsoft commented 3 years ago

This is probably caused by an inconsistency between training and testing. The FID-50k score of a model should be lower than the FID-5k score of the same model. Please check whether your data is properly processed during testing.
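
For example, the real images you feed to the Inception network at evaluation time should be resized and scaled exactly as during training. A minimal sketch of the kind of check I mean (the resize filter and value range here are assumptions; match them to whatever your training pipeline actually does):

```python
import numpy as np
from PIL import Image

def load_real_image(path, resolution=256):
    """Load a real image for FID evaluation the same way the training data was processed.
    A different resize filter, crop, or value range shifts the real statistics and inflates FID."""
    img = Image.open(path).convert('RGB')
    img = img.resize((resolution, resolution), Image.LANCZOS)  # must match the training-time resize
    return np.asarray(img, dtype=np.uint8)  # HWC, uint8 in [0, 255]
```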

slala2121 commented 3 years ago

By inconsistency, do you mean with respect to image preprocessing?


slala2121 commented 3 years ago

Or are there other differences between training and testing for the generator, e.g., dropout or batch norm? Thanks.


zsyzzsoft commented 3 years ago

Most likely due to image preprocessing.

zsyzzsoft commented 3 years ago

Another guess: maybe you computed the FID statistics of real images with 750 images, but when you evaluate the model trained with 3.8k images, it still uses the cached statistics from the 750-image set. Maybe you can clear the cached files or rename the dataset, and then re-evaluate the model trained with 3.8k images.
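
Not the actual caching code in this repo, but to illustrate the idea: if the cached real-image statistics are keyed by both the dataset path and the image count, the 750-image cache can never be picked up by the 3.8k-image evaluation (file names and helper functions below are hypothetical):

```python
import hashlib
import os
import pickle

def real_stats_cache_path(dataset_dir, num_images, cache_dir='fid_cache'):
    """Cache file keyed by dataset path AND image count, so statistics
    computed over 750 images are never reused for the 3.8k-image set."""
    key = f'{os.path.abspath(dataset_dir)}-{num_images}'
    digest = hashlib.md5(key.encode()).hexdigest()
    return os.path.join(cache_dir, f'real_stats_{digest}.pkl')

def load_or_compute_real_stats(dataset_dir, num_images, compute_fn):
    """compute_fn() returns (mu, sigma) over the real images; cache the result per dataset."""
    path = real_stats_cache_path(dataset_dir, num_images)
    if os.path.exists(path):
        with open(path, 'rb') as f:
            return pickle.load(f)
    mu, sigma = compute_fn()
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'wb') as f:
        pickle.dump((mu, sigma), f)
    return mu, sigma
```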

slala2121 commented 3 years ago

Thanks, I'll look into it.


slala2121 commented 3 years ago

I made some fixes so now the results match. Thanks for your thoughts!