xinntao / Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
BSD 3-Clause "New" or "Revised" License

Improvement Idea. #13

Open QLaHPD opened 3 years ago

QLaHPD commented 3 years ago

I think it would be very useful to add more discriminators. In the tests I have done with conditional GANs, using several discriminators with different receptive field sizes seems to widen the support of the generated distribution and to improve the stability of training and the quality of the images (it might also remove artifacts like the ones in Figure 11 of the paper). It would also be interesting to try a discriminator with an MLP-Mixer architecture (https://github.com/jaketae/mlp-mixer, https://github.com/sradc/patchless_mlp_mixer), since the MLP-Mixer paper shows that the way this type of architecture selects features is different from what a CNN does, so it might help the generator avoid certain types of artifacts.
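
To make "different receptive field sizes" concrete, here is a minimal sketch that computes the receptive field of a stack of convolution layers. The `patch_discriminator` layout (4x4 stride-2 convolutions followed by two 4x4 stride-1 convolutions, in the style of a PatchGAN) is only an assumption for illustration; it is not the discriminator used in this repository, and both function names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, first layer first.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def patch_discriminator(n_strided):
    # Hypothetical PatchGAN-style layout (an assumption, not this repo's model):
    # n_strided 4x4 stride-2 convs, then two 4x4 stride-1 convs.
    return [(4, 2)] * n_strided + [(4, 1)] * 2


# Three discriminators that each judge patches of a different size.
for depth in (1, 2, 3):
    size = receptive_field(patch_discriminator(depth))
    print(f"{depth} strided conv(s): {size}x{size} receptive field")
```

Under these assumptions the three discriminators look at 16x16, 34x34 and 70x70 patches, so each one would penalize artifacts at a different scale.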

Also, I'm not sure, but does the ESRGAN architecture have multiple noise inputs? If not, I also think it would be useful to add noise to each res-block, since more noise usually helps.
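
As a rough illustration of what "noise in each res-block" could mean, here is a framework-free sketch where a residual block adds scaled Gaussian noise to its output. Everything in it is an assumption for illustration: `noisy_residual_block`, `toy_body` and `noise_scale` are hypothetical names, the features are a plain list of floats rather than a tensor, and in a real model the scale would likely be a learned parameter. It is not taken from ESRGAN or any other existing implementation.

```python
import random


def noisy_residual_block(x, body, noise_scale, rng):
    """Return x + body(x) + noise_scale * N(0, 1), element-wise."""
    fx = body(x)
    return [
        xi + fi + noise_scale * rng.gauss(0.0, 1.0)
        for xi, fi in zip(x, fx)
    ]


def toy_body(x):
    # Stand-in for the conv layers inside a residual block (assumption).
    return [0.2 * v for v in x]


rng = random.Random(0)  # seeded so the run is repeatable
features = [1.0, -0.5, 2.0]

# With noise_scale = 0 the block reduces to a plain residual block.
plain = noisy_residual_block(features, toy_body, 0.0, rng)
# With a non-zero scale, every call perturbs the features differently.
noisy = noisy_residual_block(features, toy_body, 0.1, rng)
```

Starting with a scale of zero means the network initially behaves like the noise-free one, and the noise contribution can then be increased (or learned) during training.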

JensDA commented 3 years ago

There's ESRGAN+ if you want to check out noise input.

QLaHPD commented 3 years ago

Yes, that's what I was talking about.

xinntao commented 3 years ago

@QLaHPD Thanks for your suggestion.

You have proposed three ways to improve.

  1. more discriminators.
  2. MLP mixer architecture.
  3. include noise input in the ESRGAN arch.

It would be great if you have the resources to give them a try~

QLaHPD commented 3 years ago

Ok, I can try to code this in a fork soon, and train in Colab.

jorjiang commented 2 years ago

> Ok, I can try to code this in a fork soon, and train in Colab.

How are these going?