[Closed] IQ17 closed this issue 4 years ago
By the way, on some 256-resolution images it feels as if an insufficient number of rays is traced. Are there any parameters for denser rendering?
@IQ17 Hi, our model indeed supports 256 resolution only. We found that our network structure is not capable of handling 1024 images. A more advanced network structure (pix2pixHD, progressive growing, etc.) and additional losses may be needed.
Thanks, indeed something like pix2pixHD would be a good thing to try :)
By the way, on some 256 images, the outputs after the GAN have a pencil-drawing-like style, for example the image posted above. I notice that the inputs to the GAN, i.e. the rotated and rendered images, look good. Any suggestions?
@IQ17 This is a strange situation. My hypothesis is that the "domain" of your input image is not typical of our training set (an unusual texture style), or that the 3D fitting for this image is not fully accurate.
Hi, I compared my samples with the examples in the git repo, and I found that the resolution of the input image is responsible for the pencil-drawing artefact.
Generally speaking, the higher the input resolution, the more visible the artefact. Specifically, your GAN is trained on 256-resolution images, yet the face inside each image spans roughly 128 pixels, which means the texture is upsampled to 256. Consequently, the GAN is more familiar with that kind of blurry 256 texture.
When testing with images whose facial area is larger than 256, the texture is downsampled into a sharp 256 texture; the sharper the 256 texture, the less suitable it is for the GAN, and that causes the artefact.
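The blur-vs-sharp argument above can be sketched numerically. This is a minimal check of my own (not from the repo), using a random pattern as a stand-in for a facial texture and Laplacian variance as a crude sharpness proxy: the 128-to-256 upsampling path yields a noticeably smoother texture than the direct downsample to 256.

```python
import numpy as np
from PIL import Image

def laplacian_var(img):
    """Rough sharpness proxy: variance of a 4-neighbour Laplacian."""
    a = np.asarray(img, dtype=np.float64)
    lap = (-4 * a[1:-1, 1:-1] + a[:-2, 1:-1] + a[2:, 1:-1]
           + a[1:-1, :-2] + a[1:-1, 2:])
    return lap.var()

# Synthetic high-frequency pattern standing in for a high-resolution face crop.
rng = np.random.default_rng(0)
hi_res = Image.fromarray(rng.integers(0, 256, (512, 512), dtype=np.uint8))

# Path A: face only spans ~128 px, so the texture is upsampled to 256 (blurry).
blurry_256 = hi_res.resize((128, 128), Image.BILINEAR).resize((256, 256), Image.BILINEAR)

# Path B: face spans more than 256 px, so the texture is downsampled to 256 (sharp).
sharp_256 = hi_res.resize((256, 256), Image.BILINEAR)

print(laplacian_var(blurry_256) < laplacian_var(sharp_256))  # upsampled path is smoother
```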
So, maybe two solutions: 1) resize the input image so that the facial height is less than 200 pixels, then run the test; or 2) finetune the GAN with sharp 256 inputs.
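Solution 1 could be sketched as below. The helper name and the way `face_height` is obtained are hypothetical; in practice it would come from whichever face detector the pipeline uses, and the 200 px threshold is just the figure suggested above.

```python
from PIL import Image

def resize_for_gan(img, face_height, target_face_height=200):
    """Shrink the whole frame so the face spans under `target_face_height` px.

    `face_height` is assumed to be measured by an external face detector
    (hypothetical here); images already below the threshold pass through.
    """
    if face_height <= target_face_height:
        return img  # already in the blurry-texture regime the GAN saw in training
    scale = target_face_height / face_height
    new_size = (max(1, round(img.width * scale)),
                max(1, round(img.height * scale)))
    return img.resize(new_size, Image.BILINEAR)

# Example: a 1024x1024 frame whose face spans 512 px is scaled by 200/512.
img = Image.new("RGB", (1024, 1024))
small = resize_for_gan(img, face_height=512)
print(small.size)  # (400, 400)
```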
Just some opinions; I wonder if they are correct.
By the way, forget the image in the second post, which seems to be a particularly bad case.
Yes, I think you are correct! The reported results on CelebA-HQ are indeed finetuned ones. The model we released was trained on MS1M, and the general resolution of that dataset is indeed quite low. As we do not rely on supervision, you can actually finetune it on any dataset for better visualization.
Thanks for your kind instructions and great work! I shall try finetuning later on :)
Hi, the CelebA-HQ results in the paper are quite impressive! Yet when I tried some randomly selected CelebA-HQ (1024×1024) images, there are some checkerboard artefacts; please refer to the attachments.
I think the GAN model is for 256 resolution; do you use another model for those high-resolution images? Also, any suggestions on parameter settings or examples would be appreciated.
Thanks!