Open imlixinyang opened 5 years ago
You noticed an interesting fact; thank you for the question. In brief, there are three settings: (1) StarGAN trained on CelebA, (2) StarGAN trained on CelebA and RaFD, and (3) StarGAN trained on CelebA-HQ.
The StarGAN paper reported its good results with setting (2). RaFD contains many high-quality faces (256x256), but it is a private dataset that we cannot access. What we can do is try to reproduce the 256x256 result with only CelebA, which is setting (1), and (1) gives a reasonably good result. However, CelebA and CelebA-HQ differ not only in quality but also in quantity: CelebA has about 200,000 faces, whereas CelebA-HQ has only 30,000 high-quality faces.
We infer that in setting (3) StarGAN suffers from the scarcity of high-quality faces. With CelebA-HQ and RaFD together, StarGAN might well produce good results. Our method, on the other hand, uses a matching-aware discriminator for the conditional labels, whereas StarGAN uses a classifier. This helps the GAN overcome the scarcity problem, so our method still gets good results with only the 30,000 CelebA-HQ faces.
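To make the distinction concrete, here is a minimal sketch of the two discriminator objectives. It is not the code from either repository: the function names are hypothetical, the inputs are plain probabilities rather than network outputs, and binary cross-entropy stands in for whatever adversarial loss each method actually uses. The matching-aware form follows the common formulation in which a real image paired with a wrong label is treated as a fake pair.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability (clamped for stability)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def matching_aware_d_loss(d_real_right, d_real_wrong, d_fake_right):
    """Hypothetical matching-aware discriminator loss.

    The discriminator scores (image, label) pairs:
      d_real_right: real image with its true label      -> should be 1
      d_real_wrong: real image with a mismatched label  -> should be 0
      d_fake_right: generated image with its target label -> should be 0
    The two "fake pair" terms are averaged (an assumption of this sketch).
    """
    return bce(d_real_right, 1.0) + 0.5 * (
        bce(d_real_wrong, 0.0) + bce(d_fake_right, 0.0)
    )


def classifier_style_d_loss(d_real, d_fake, p_true_label_real):
    """Hypothetical classifier-style loss, for contrast.

    The real/fake score ignores the label; a separate classifier head is
    trained to predict the true label of real images.
    """
    return bce(d_real, 1.0) + bce(d_fake, 0.0) + bce(p_true_label_real, 1.0)


# A discriminator that accepts a real image with the wrong label is penalised
# by the matching-aware loss.
good = matching_aware_d_loss(0.9, 0.1, 0.1)
bad = matching_aware_d_loss(0.9, 0.9, 0.1)
print(good < bad)
```

In the matching-aware version, every real image also serves as a negative example when paired with a wrong label, which is one plausible reason it copes better with a small dataset.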
I noticed that on page 6 of your paper, the StarGAN results are really poor. I reproduced StarGAN on CelebA-HQ and got bad results too. But in the StarGAN paper, the 256x256 results look good, so I'm a little confused. (When I train StarGAN on CelebA, the results are good enough, but on CelebA-HQ it fails even to reconstruct the input image.)