yuval-alaluf / restyle-encoder

Official Implementation for "ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement" (ICCV 2021) https://arxiv.org/abs/2104.02699
https://yuval-alaluf.github.io/restyle-encoder/
MIT License
1.03k stars 156 forks

bad results #61

Closed (Jimzhou82sub closed 2 years ago)

Jimzhou82sub commented 2 years ago

Could you help me?

MartFire commented 2 years ago

I think the model expects aligned images. I'm not sure if the model is overfitted on FFHQ or it needs very high-resolution images as input (1024x1024) but apart from images from FFHQ (the model was trained on these images), all the reconstructions were quite poor when I tried other datasets. Plus, the additional refinement steps make it worse: the network tries to reduce the pixel loss, but the face doesn't look human anymore.
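
For context on the "additional steps" mentioned here: ReStyle refines an inversion iteratively, with the encoder predicting a residual that is added to the current latent code at each step. The toy sketch below only illustrates that loop. The scalar "images", `toy_generator`, `toy_encoder`, and `refine` are hypothetical stand-ins, not the repository's code.

```python
def refine(target, encoder, generator, w_init, steps):
    """Iteratively update a latent code by adding predicted residuals."""
    w = w_init
    history = [w]
    for _ in range(steps):
        reconstruction = generator(w)            # current output for latent w
        delta = encoder(target, reconstruction)  # residual from (input, reconstruction)
        w = w + delta                            # residual update of the latent
        history.append(w)
    return history


# Hypothetical toy stand-ins: scalars play the role of images and latents.
def toy_generator(w):
    return 2.0 * w


def toy_encoder(x, y):
    # Predicts a step proportional to the remaining reconstruction error.
    return 0.25 * (x - y)


if __name__ == "__main__":
    for step, w in enumerate(refine(8.0, toy_encoder, toy_generator, 0.0, 3)):
        print(step, w, toy_generator(w))
```

In this toy setting each step moves the reconstruction closer to the target. With a real encoder, nothing guarantees the same behaviour on inputs that lie outside what it was trained on, such as unaligned faces.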

Jimzhou82sub commented 2 years ago

OK, I got it, thanks a lot. Best wishes to you.

yuval-alaluf commented 2 years ago

All the results posted above are unaligned and are therefore out of distribution to what the StyleGAN generator can generate. Please align all your faces before passing them to the encoder.

> I'm not sure if the model is overfitted on FFHQ or it needs very high-resolution images as input (1024x1024) but apart from images from FFHQ (the model was trained on these images), all the reconstructions were quite poor when I tried other datasets.

I think this conclusion is a bit misleading. The model is not overfitted to FFHQ, since it also achieves very good reconstructions on images from other datasets (e.g., CelebA-HQ) and on images in the wild (e.g., the images shown in the repo). The input also does not need to be high resolution (1024x1024). The encoder itself was trained on 256x256 images from FFHQ, so there is no need for inputs to be at such a high resolution. In fact, the inputs can be 256x256 while the outputs will be 1024x1024, so in some cases the encoder can actually improve the resolution of the input image.

All this is to say that when working on faces and StyleGAN2, the inputs must be aligned before passing them to the encoder.
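
To make "aligned" concrete, here is a minimal sketch of the geometry behind FFHQ-style face alignment: from eye and mouth landmarks it derives an oriented square crop and the in-plane rotation to undo. This is an illustration under assumptions, not the repository's alignment script. The function name `ffhq_style_quad` is hypothetical, the landmark coordinates are assumed to come from some facial landmark detector, and the constants 2.0, 1.8 and 0.1 follow the commonly used FFHQ recipe.

```python
import math


def ffhq_style_quad(eye_left, eye_right, mouth_left, mouth_right):
    """Return (quad, angle_deg) for an oriented square crop around a face.

    All inputs are (x, y) pixel coordinates with y pointing down.
    quad lists the corners as top-left, bottom-left, bottom-right, top-right.
    """
    eye_avg = ((eye_left[0] + eye_right[0]) / 2, (eye_left[1] + eye_right[1]) / 2)
    mouth_avg = ((mouth_left[0] + mouth_right[0]) / 2,
                 (mouth_left[1] + mouth_right[1]) / 2)
    eye_to_eye = (eye_right[0] - eye_left[0], eye_right[1] - eye_left[1])
    eye_to_mouth = (mouth_avg[0] - eye_avg[0], mouth_avg[1] - eye_avg[1])

    # Horizontal axis of the crop: the eye direction combined with the
    # eye-to-mouth direction rotated by 90 degrees.
    x = (eye_to_eye[0] + eye_to_mouth[1], eye_to_eye[1] - eye_to_mouth[0])
    norm = math.hypot(*x)
    if norm == 0:
        raise ValueError("degenerate landmarks")
    # Half side length of the crop, scaled from the face size (assumed constants).
    half = max(math.hypot(*eye_to_eye) * 2.0, math.hypot(*eye_to_mouth) * 1.8)
    x = (x[0] / norm * half, x[1] / norm * half)
    y = (-x[1], x[0])  # vertical axis, perpendicular to x

    # Crop center: slightly below the midpoint between the eyes.
    c = (eye_avg[0] + eye_to_mouth[0] * 0.1, eye_avg[1] + eye_to_mouth[1] * 0.1)
    quad = [
        (c[0] - x[0] - y[0], c[1] - x[1] - y[1]),
        (c[0] - x[0] + y[0], c[1] - x[1] + y[1]),
        (c[0] + x[0] + y[0], c[1] + x[1] + y[1]),
        (c[0] + x[0] - y[0], c[1] + x[1] - y[1]),
    ]
    angle_deg = math.degrees(math.atan2(x[1], x[0]))
    return quad, angle_deg


if __name__ == "__main__":
    quad, angle = ffhq_style_quad((100, 100), (200, 100), (120, 200), (180, 200))
    print(quad, angle)
```

An image would then be warped so that this quad maps onto an upright square before being resized to the encoder's input size. A rotated or off-center face yields a tilted or shifted quad, which is the difference between an unaligned and an aligned input.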

Jimzhou82sub commented 2 years ago

I rotated the picture and got a good result. Great job! Speaking from my experience, I think the photo's length-width ratio plays a pretty important role.

Ir1d commented 2 years ago

@Jimzhou82sub Hi, what do you mean by "photo's length-width ratio plays a pretty important role"? Did you resize the image with different ratios? Thanks!