tobran / DF-GAN

[CVPR2022 oral] A Simple and Effective Baseline for Text-to-Image Synthesis

Some questions about DF-GAN #3

Closed kuailefeifei closed 4 years ago

kuailefeifei commented 4 years ago

Hello,

First, thank you for your work. I have a few questions about the code:

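  1. The activation function in DFBlock. The paper shows DFBlock using ReLU, but the released code uses LeakyReLU.

  2. The released pre-trained model. The IS I measure with your released pre-trained model is very low, but after retraining with your code the model looks normal. Is there something wrong with the released pre-trained model?

  3. "A titan xp (set nf=32 in .yaml) or a V100 32GB (set nf=64 in .yaml)". Was the IS of 4.86 for bird generation reported in the paper measured with nf=32 or with nf=64? Does choosing nf=32 versus nf=64 make a big difference? I will of course also test this myself later.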

That is a lot of questions; I would be very grateful for any answers.

tobran commented 4 years ago

Q1: About the LeakyReLU activation function in our DFBlock code
A1: Experimental results show that there is almost no difference between using LeakyReLU and ReLU in DFBlocks.
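For readers unfamiliar with the two activations, here is a minimal pure-Python sketch of how they differ. It is an illustration only and not code from the repository; the `negative_slope=0.2` value is an assumption chosen for the example, so check the released code for the value actually used.

```python
def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.2):
    """LeakyReLU: negative inputs are scaled instead of zeroed.

    negative_slope=0.2 is an illustrative assumption, not a value
    confirmed by this thread.
    """
    return x if x >= 0 else negative_slope * x


inputs = [-2.0, -0.5, 0.0, 1.5]
# The two functions agree for x >= 0 and differ only on negative inputs.
print([relu(v) for v in inputs])
print([leaky_relu(v) for v in inputs])
```

Because the functions differ on negative inputs, weights trained with one activation are not guaranteed to behave the same when evaluated with the other.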

Q2: About the released code and pre-trained models
A2: We retested the released code, and the result (IS: 4.85) shows that there is no problem with the code or the pre-trained models. Maybe you changed the structure of the model (LeakyReLU to ReLU). We recommend re-downloading the code and the required files according to the instructions.

Q3: About nf in *.yaml
A3: The result in our paper was obtained with nf=64. Image quality improves as nf (the channel size) increases [1].

[1] Brock, Andrew, Jeff Donahue, and Karen Simonyan. "Large scale GAN training for high fidelity natural image synthesis." arXiv preprint arXiv:1809.11096 (2018).
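As a rough illustration of why nf affects both capacity and GPU memory, the weight count of a convolution grows roughly with the square of its channel width. The sketch below is not taken from the DF-GAN code: `conv_params` is a hypothetical helper, and the single 3x3 layer with equal input and output channels is an assumption made only to show the scaling.

```python
def conv_params(channels, kernel_size=3):
    """Parameters of one conv layer with `channels` in and out, plus bias.

    Hypothetical helper for illustration; a real generator stacks many
    layers with widths that are multiples of nf.
    """
    weights = kernel_size * kernel_size * channels * channels
    bias = channels
    return weights + bias


for nf in (32, 64):
    print(f"nf={nf}: {conv_params(nf)} parameters in one 3x3 conv")

# Doubling the width roughly quadruples the parameter count of the layer.
print(conv_params(64) / conv_params(32))
```

This only shows how model size scales with nf; it says nothing by itself about how much the IS changes.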

kuailefeifei commented 4 years ago

First, thank you for your reply; it helped me a lot.

I have re-downloaded the code and the required files and tested the pre-trained model you provide. The IS is about 4.6 (the variation may be because random noise is used as input when generating images), which I think is normal.

I still have one question. The pre-trained model you provide has nf=32, and you said you retested the released code and got IS = 4.85. Was that result computed with a pre-trained model that has nf=64? I would appreciate it if you could answer this question.

tobran commented 4 years ago

To get a more stable IS, we retested on 30,000 synthesized images. The IS of the pretrained model (nf=32) is 4.75. You can change the code on line 64 in main.py to control the number of synthesized images.
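To see why a larger sample gives a more stable IS, here is a minimal pure-Python sketch of the Inception Score formula, IS = exp(E_x[KL(p(y|x) || p(y))]), evaluated on synthetic class probabilities. This is not the evaluation code from the repository: `fake_predictions` is a hypothetical stand-in for the Inception-v3 outputs on generated images, and the class count, sample sizes, and trial count are assumptions chosen to keep the example fast.

```python
import math
import random


def inception_score(probs):
    """IS = exp(mean over samples of KL(p(y|x) || p(y)))."""
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    total_kl = 0.0
    for row in probs:
        total_kl += sum(p * math.log(p / m)
                        for p, m in zip(row, marginal) if p > 0)
    return math.exp(total_kl / n)


def fake_predictions(n, k, rng):
    """Hypothetical stand-in for classifier outputs: random softmax rows."""
    rows = []
    for _ in range(n):
        logits = [rng.gauss(0.0, 2.0) for _ in range(k)]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def score_spread(n, k=10, trials=20, seed=0):
    """Mean and standard deviation of the IS over repeated draws of n samples."""
    rng = random.Random(seed)
    scores = [inception_score(fake_predictions(n, k, rng))
              for _ in range(trials)]
    mean = sum(scores) / trials
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / trials)
    return mean, std


for n in (50, 1000):
    mean, std = score_spread(n)
    print(f"n={n}: IS mean={mean:.3f}, std={std:.4f}")
```

With fewer samples, repeated runs scatter more widely around the mean, which is consistent with the small differences between the IS values reported in this thread.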