eladrich / pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework
https://eladrich.github.io/pixel2style2pixel/
MIT License

Training/Testing dataset #53

Closed minha12 closed 3 years ago

minha12 commented 3 years ago

Thank you for your great work! I'm about to try to reproduce the FFHQ encoder. The problem may not be with your code but with the dataset preparation. The NVIDIA FFHQ dataset is available at a resolution of 1024x1024 (~90GB), and I couldn't download it due to the limited Google Drive quota. So, I have a few (beginner) questions:

  1. Is there a link somewhere to download a more lightweight FFHQ 256x256 version?
  2. If I download the 1024x1024 (70k images) dataset and set its path in configs/paths_config.py, will the images automatically be resized to the input size of 256x256 during training?
  3. In the README, you suggest using CelebA-HQ as the test dataset (`'celeba_test': '/path/to/CelebAMask-HQ/test_img',`). Is this necessary? I mean, can we just split FFHQ into train/test (maybe 80/20)? I don't see a way to do this in your code. What size of test dataset do you recommend?

I know these questions may be trivial for you, but answering them is vital for me to start the training.

yuval-alaluf commented 3 years ago

Hi @minha12

  1. Unfortunately, we do not have a way to share the FFHQ dataset. I recommend trying to download the dataset from the original repository every few hours or so.
  2. Yes. Please refer to the following transforms defined for training our encoder: https://github.com/eladrich/pixel2style2pixel/blob/38300fae448a8cb4cc47694dce6cd6e61c3ba2ca/configs/transforms_config.py#L23-L27 Here, we resize the images to a size of 256x256.
  3. It is not necessary to use the CelebA-HQ dataset as your test set. You can split FFHQ into train and test sets if you wish. However, note that the StyleGAN generator was trained on all FFHQ images; therefore, using some of them as test data may be considered "data leakage".
minha12 commented 3 years ago

Great! I found a resized FFHQ (256x256, ~2GB) on Kaggle: https://www.kaggle.com/xhlulu/flickrfaceshq-dataset-nvidia-resized-256px However, the dataset's folder structure is slightly different from the original (no subfolders 1000, 2000, ...). I hope it will work without significant changes. As suggested in the paper, I will split CelebA-HQ into 24000/6000 and use the 6000 images as a test set. Thanks again for your help!
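For anyone else doing the 24000/6000 split mentioned above, it can be produced with a small deterministic shuffle. This is just a sketch; the helper name and path are illustrative, not part of the pSp repo:

```python
import random
from pathlib import Path

def split_file_list(paths, n_test=6000, seed=0):
    """Deterministically split a list of image paths into train/test.

    Sorting before the seeded shuffle makes the split reproducible
    across runs and machines.
    """
    paths = sorted(paths)
    rng = random.Random(seed)
    rng.shuffle(paths)
    return paths[n_test:], paths[:n_test]

# Illustrative usage on the 30k CelebA-HQ images:
# all_imgs = list(Path("/path/to/CelebAMask-HQ/CelebA-HQ-img").glob("*.jpg"))
# train_paths, test_paths = split_file_list(all_imgs)  # 24000 / 6000
```

The two lists can then be copied (or symlinked) into separate folders, and the train/test paths in `configs/paths_config.py` pointed at those folders.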