LS4GAN / uvcgan2

UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation
https://arxiv.org/abs/2303.16280

How to use UVCGANv2 for grayscale image conversion? #19

Open y-h-Lin opened 10 months ago

y-h-Lin commented 10 months ago

Hi, I am trying to use UVCGANv2 for grayscale image (8-bit) conversion (both domains are grayscale images). My current approach is to convert 8-bit grayscale images into RGB images and then train the models. However, the final images are all white. I was wondering if you have any recommended approaches for handling this? Thank you!
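
For reference, this is roughly the conversion step I am using (a minimal sketch, assuming Pillow; the directory names are just placeholders):

```python
from pathlib import Path
from PIL import Image

SRC = Path('data/grayscale')   # placeholder: input directory with 8-bit grayscale images
DST = Path('data/rgb')         # placeholder: output directory for the 3-channel copies
DST.mkdir(parents=True, exist_ok=True)

for path in sorted(SRC.glob('*.png')):
    # Load the 8-bit grayscale ('L' mode) image and replicate it across 3 channels
    Image.open(path).convert('RGB').save(DST / path.name)
```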

usert5432 commented 10 months ago

Hi @y-h-Lin,

> However, the final images are all white. I was wondering if you have any recommended approaches for handling this?

I think the following steps would be a good starting point for investigating this issue:

  1. Check whether the uvcgan2 code loads your images correctly. If you translate images with translate_images.py, it will save the untranslated input images in the real_a and real_b directories, next to fake_a and fake_b. It is worth looking over the real_a and real_b images to make sure they are correct (not all white). If they are not, then there is probably a bug somewhere in the image loader code, and we will need to fix it. (A rough sketch of both checks follows this list.)

  2. If uvcgan2 loads the images correctly but the translations look all white, it is possible that the training has diverged. We can look at the training losses to check whether that happened. Could you please post the last line of the history.csv file (it should be saved in the model directory) here? It will tell us whether the training has diverged or not.
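
If it helps, here is a rough sketch of both checks (plain Python with Pillow and numpy; the paths are only examples, adjust them to wherever translate_images.py saved its outputs and to your model directory):

```python
from pathlib import Path

import numpy as np
from PIL import Image

EVAL_DIR  = Path('outdir/evals/final')   # example path: where translate_images.py wrote its outputs
MODEL_DIR = Path('outdir/model')         # example path: directory containing history.csv

# 1. Pixel statistics of the saved images. An all-white 8-bit image has min == mean == max == 255.
for subdir in ('real_a', 'real_b', 'fake_a', 'fake_b'):
    for path in sorted((EVAL_DIR / subdir).glob('*'))[:5]:
        pixels = np.asarray(Image.open(path))
        print(f'{subdir}/{path.name}: min={pixels.min()} mean={pixels.mean():.1f} max={pixels.max()}')

# 2. Last line of history.csv. Unusually large gen/cycle losses suggest divergence.
with open(MODEL_DIR / 'history.csv') as f:
    print(f.readlines()[-1])
```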

y-h-Lin commented 10 months ago

Thank you for your reply.

Regarding point 1, I believe the image loader is functioning properly. The real_a and real_b directories contain normal images; however, the other four directories (namely fake_a, fake_b, reco_a, reco_b) contain white images.

Below is the final line from my history.csv:

| gen_ab | gen_ba | cycle_a | cycle_b | disc_a | disc_b | idt_a | idt_b | gp_a | gp_b |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 0.998819 | 0.998798 | 1.414826 | 1.512557 | 0.000296 | 0.000291 | 0.707413 | 0.756278 | 0.000256 | 0.000243 |

usert5432 commented 10 months ago

Hi @y-h-Lin,

Thank you for elaborating on this issue. Looking over the training losses, I can definitely say that the training has diverged (normally, gen_ab ~ gen_ba ~ 0.2-0.8 and cycle_a ~ cycle_b ~ 0.2-0.5). I think a few things can be done about it:

  1. Try increasing the lambda_a and lambda_b parameters of the network configuration. If your images are dominated by a single color (e.g. a large white background), then it may be necessary to do this regardless. E.g., try increasing these values: https://github.com/LS4GAN/uvcgan2/blob/f74160381048ed753f1740c99a12892eaa827f6f/scripts/celeba_hq/train_m2f_translation.py#L109-L110

  2. If that does not help, try increasing the magnitude of the gradient penalty lambda_gp, e.g.: https://github.com/LS4GAN/uvcgan2/blob/f74160381048ed753f1740c99a12892eaa827f6f/scripts/celeba_hq/train_m2f_translation.py#L123

  3. Alternatively, you might already have satisfactory translations. In some cases, training progresses smoothly for many epochs and only diverges towards the end. You can check the history.csv file to identify an epoch where gen_ab suddenly jumps above, say, 0.8 (see the sketch after this list). If you find such an epoch, it is possible that the training was successful before that point, and you can try translating images using a network from before the divergence occurred. To do that, use translate_images.py --epoch N, where N is an epoch before the divergence. Networks are saved every 50 epochs by default, so N should be a multiple of 50: https://github.com/LS4GAN/uvcgan2/blob/f74160381048ed753f1740c99a12892eaa827f6f/scripts/celeba_hq/train_m2f_translation.py#L138
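
To make step 3 concrete, here is a minimal sketch that scans history.csv for the first row where gen_ab exceeds 0.8. It assumes the file is comma-separated with a gen_ab column and one row per epoch; the path and threshold are just examples to adjust for your setup:

```python
import csv

HISTORY   = 'outdir/model/history.csv'   # example path: history.csv inside your model directory
THRESHOLD = 0.8                          # gen_ab values above this usually indicate divergence

with open(HISTORY, newline='') as f:
    rows = list(csv.DictReader(f))

# Assuming one row per epoch, report the first epoch where gen_ab jumps above the threshold.
for epoch, row in enumerate(rows, start=1):
    if float(row['gen_ab']) > THRESHOLD:
        print(f'gen_ab first exceeds {THRESHOLD} around epoch {epoch}.')
        print('Try translating with a checkpoint saved before this epoch (a multiple of 50).')
        break
else:
    print('gen_ab never exceeds the threshold; the run may not have diverged at all.')
```

Once you have such an epoch, pass a multiple of 50 from before it to translate_images.py --epoch N to translate with the pre-divergence networks.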

Finally, your dataset may be inherently difficult to translate. In that case, some dataset-specific workarounds will need to be developed.