POSTECH-CVLab / PyTorch-StudioGAN

StudioGAN is a PyTorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.
https://github.com/MINGUKKANG
Other
3.42k stars 342 forks

Training ReACGAN with fewer classes and higher resolution images #140

Open festinais opened 2 years ago

festinais commented 2 years ago

I initially trained ReACGAN with 7 classes and 64x64 resolution images and got fair generated samples: the FID reached 40 after 50% of training.

Right now I'm training with only 3 classes and 128x128 resolution images. I changed the config file for this number of classes and the z_embedding.
I was expecting to get better results with fewer classes and higher resolution. However, the FID is not converging yet: at 40% of training it is 152, and the generated images still look unrealistic.

Any input/idea on why this might happen is appreciated! Thank you

mingukkang commented 2 years ago

I was expecting to get better results with fewer classes and higher resolution. However, the FID is not converging yet. Now it's 40% of training and the FID is 152 - also the generated images still look unrealistic.

=> I think this phenomenon is attributable to two sources: (1) a lack of training data and (2) improper hyperparameter selection. To solve the problem, I recommend turning on differentiable augmentations for data-efficient training. After that, tuning the hyperparameters of your model should resolve it.

Best,

Minguk
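
For readers unfamiliar with the differentiable augmentation recommended above, here is a minimal PyTorch sketch of the idea, not StudioGAN's own implementation. The function names (`rand_brightness`, `rand_translation`, `rand_cutout`, `diff_augment`) and the ratios are illustrative assumptions. Unlike the published DiffAugment method, this simplified version uses one shift and one cutout box per batch rather than per image.

```python
import torch
import torch.nn.functional as F


def rand_brightness(x):
    # Add one random offset in [-0.5, 0.5) per image.
    offset = torch.rand(x.size(0), 1, 1, 1, device=x.device, dtype=x.dtype) - 0.5
    return x + offset


def rand_translation(x, ratio=0.125):
    # Shift the whole batch by up to ratio * size pixels, filling with zeros.
    _, _, h, w = x.shape
    max_dy, max_dx = int(h * ratio), int(w * ratio)
    dy = int(torch.randint(-max_dy, max_dy + 1, (1,)))
    dx = int(torch.randint(-max_dx, max_dx + 1, (1,)))
    # F.pad order for the last two dims: (left, right, top, bottom).
    padded = F.pad(x, (max_dx, max_dx, max_dy, max_dy))
    top, left = max_dy + dy, max_dx + dx
    return padded[:, :, top:top + h, left:left + w]


def rand_cutout(x, ratio=0.5):
    # Zero out one random rectangle covering ratio * size along each axis.
    _, _, h, w = x.shape
    ch, cw = int(h * ratio), int(w * ratio)
    top = int(torch.randint(0, h - ch + 1, (1,)))
    left = int(torch.randint(0, w - cw + 1, (1,)))
    mask = torch.ones_like(x)
    mask[:, :, top:top + ch, left:left + cw] = 0
    return x * mask


def diff_augment(x):
    # Every step is built from differentiable tensor ops, so gradients
    # flow back through the augmentation to the generator.
    return rand_cutout(rand_translation(rand_brightness(x)))
```

In a training loop the same function would wrap both discriminator inputs, e.g. `D(diff_augment(real))` and `D(diff_augment(G(z)))`, in the discriminator update and the generator update alike.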

festinais commented 2 years ago

Thank you for your feedback!

I tried training with all 7 classes at 128x128 resolution, and the FID is not converging even after 30% of training; it always stays above 200. However, when I trained with all 7 classes at 64x64 resolution, the FID converged and reached 45 after 40% of training.

It is the same dataset with the same classes, just at a different resolution. Do you have any idea why this might happen? Does it mean the model is not able to generate highly discriminable images?

Thank you!
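
For reference when comparing the FID values quoted in this thread: FID is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. Below is a minimal NumPy sketch of that distance, assuming the feature arrays have already been extracted. The function names are illustrative and are not StudioGAN's API.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr(sqrt(sigma1 @ sigma2)) is the sum of the square roots of the
    # eigenvalues of the product; clip tiny negatives from round-off.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


def fid_from_features(real_feats, fake_feats):
    """real_feats, fake_feats: arrays of shape (num_samples, feature_dim)."""
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)
    return frechet_distance(mu_r, sigma_r, mu_f, sigma_f)
```

Because the mean and covariance are estimated from finite samples, the score depends on how many images are used, so FID values are best compared between runs evaluated with the same sample count and the same reference set.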

festinais commented 2 years ago

To give you a more detailed overview of the dataset, it is the MURA dataset: https://stanfordmlgroup.github.io/competitions/mura/

I'm trying to generate synthetic data for the underrepresented classes: humerus and forearm. There is also the problem of class overlap and that's why I'm trying to use ReACGAN to generate hard negatives with good fidelity.

festinais commented 2 years ago

This is how the generated images look when I trained with all of the classes at 64x64 image size. Training only reached 50%, and the best FID score was 45.60. I didn't use any augmentations or hyperparameter tuning.

[Screenshots: generated samples, 64x64]

Note: one of the classes, Forearm, collapsed (this is visible in the generated samples above).


Now I'm training again with all seven classes, but at 128x128 resolution.

This is how the FID is evolving (it is not converging!):

[Screenshot: FID curve, 128x128]

And the generated samples:

[Screenshot: generated samples, 128x128]

So this is the summary of my two experiments. I want to know the reason behind this; I have some intuition, but I'm not sure. Before I move on to trying different augmentations that might help, I wanted to settle this first and be sure what's happening. I would highly appreciate your feedback! Thank you!

festinais commented 2 years ago

Also, if you have any feedback on what type of augmentations to use, or other ideas that would be helpful for this particular dataset, I would highly appreciate it! Thank you.

DongChen06 commented 2 years ago

@festinais hi bro, any progress?

asif-nuaa commented 5 months ago

@festinais hi bro, any progress?

I am also looking for a solution to this. Please share if you have made any progress.