About the bad results of StarGAN in your paper.

You noticed an interesting fact. Thank you for asking. In brief, there are three settings: (1) StarGAN trained on CelebA, (2) StarGAN trained on CelebA and RaFD, and (3) StarGAN trained on CelebA-HQ.

The StarGAN paper reports its good results using setting (2). RaFD contains numerous high-quality faces (256x256), but it is a private dataset that we cannot access. What we can do is try to reproduce the 256x256 result with CelebA alone, which is setting (1). Setting (1) gives a reasonably good result. However, CelebA and CelebA-HQ differ not only in quality but also in quantity: CelebA has about 200,000 faces, whilst CelebA-HQ has only 30,000 high-quality faces.

We infer that StarGAN suffers from the scarcity of high-quality faces in setting (3). With CelebA-HQ and RaFD together, StarGAN might be able to produce good results. Our method, on the other hand, uses a matching-aware discriminator for the conditional labels, whereas StarGAN uses a classifier. This helps the GAN overcome the scarcity problem, so our method can still get good results with only the 30,000 CelebA-HQ faces.
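To make the contrast with a classifier more concrete, here is a minimal sketch of how training examples for a matching-aware discriminator could be assembled. It only illustrates the general idea and is not the code in this repository: the function name `build_matching_aware_pairs`, the tuple layout, and the use of a single integer class label per image (instead of attribute vectors) are all assumptions made for the example.

```python
import random


def build_matching_aware_pairs(real_labels, fake_targets, num_classes, rng):
    """Build (source, index, label, target) tuples for a conditional discriminator.

    Hypothetical helper: `source` says whether the image is real or generated,
    `index` points into the corresponding batch, `label` is the condition shown
    to the discriminator, and `target` is the output it should produce.
    """
    pairs = []
    for i, label in enumerate(real_labels):
        # Real image with its true label: the discriminator should output 1.
        pairs.append(("real", i, label, 1))
        # Same real image with a deliberately wrong label: should output 0.
        wrong = rng.choice([c for c in range(num_classes) if c != label])
        pairs.append(("real", i, wrong, 0))
    for j, target in enumerate(fake_targets):
        # Generated image with the label it was asked to show: should output 0.
        pairs.append(("fake", j, target, 0))
    return pairs


# Two real images (labels 0 and 2) and two generated images (both targeting label 1).
pairs = build_matching_aware_pairs([0, 2], [1, 1], num_classes=3, rng=random.Random(0))
```

In this sketch, each real image contributes both a matching and a mismatched example, so the discriminator is trained to judge the image and its label jointly instead of predicting the label on its own.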

elvisyjlin / RelGAN-PyTorch
