regarding to the test dataset preparation

yunjey / stargan

StarGAN - Official PyTorch Implementation (CVPR 2018)

MIT License

regarding to the test dataset preparation #117

Open zyzhang1130 opened 4 years ago

zyzhang1130 commented 4 years ago

Hi, I would like to clear up the following confusion:

Why does the test dataset need to be split as well? I understand that at the training stage the model needs to know which category (e.g. happy, sad, etc.) each image belongs to, but for testing there should be only one category of input, and the model is supposed to generate all the categories it saw during training, right?

Correct me if I'm wrong. Thank you.

yunjey commented 4 years ago

@zyzhang1130 For categorical setting (e.g. RaFD), we don't need the original label of an input. The reason why we split test data is to use ImageFolder as our dataset class. You could implement your own dataset class to avoid splitting test data.
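A minimal sketch of the "own dataset class" idea, assuming the un-grouped test images sit directly in one directory. `FlatImageFolder`, `IMG_EXTENSIONS`, and `dummy_label` are hypothetical names, not part of the StarGAN repository. The class returns `(image, label)` pairs on the assumption that the test loop unpacks batches the same way it does for `ImageFolder`, and Pillow is assumed for image loading.

```python
from pathlib import Path

# Hypothetical extension list; adjust to match your data.
IMG_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


class FlatImageFolder:
    """Map-style dataset over un-grouped images in a single directory.

    Hypothetical helper, not part of the StarGAN repo. It provides
    __len__ and __getitem__, the two methods a map-style dataset needs.
    """

    def __init__(self, root, transform=None, dummy_label=0):
        # Collect image files that sit directly in `root` (no sub-folders).
        self.paths = sorted(
            p for p in Path(root).iterdir()
            if p.is_file() and p.suffix.lower() in IMG_EXTENSIONS
        )
        if not self.paths:
            raise RuntimeError(f"Found 0 image files in: {root}")
        self.transform = transform
        # Placeholder label (assumption): kept only so that each item
        # has the same (image, label) shape that ImageFolder yields.
        self.dummy_label = dummy_label

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # Imported lazily so that listing files does not require Pillow.
        from PIL import Image

        image = Image.open(self.paths[index]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, self.dummy_label
```

An instance of this class could then be passed to `torch.utils.data.DataLoader` in place of the `ImageFolder` dataset, with the same transform that the test mode applies.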

For binary setting (e.g. CelebA), we need the original label of an input to perform single or multiple attribute transfer.

zyzhang1130 commented 4 years ago

Yeah, I am using my own dataset in a similar fashion to RaFD, with the given Test StarGAN on custom datasets code. I set up my training data files as per the instructions given, but I still got an error saying I had not split up my testing data. Does that mean I shouldn't use the test command to generate new images? If so, which command should I use?

Thank you.

zyzhang1130 commented 4 years ago

I used this one to generate new images:

python main.py --mode test --dataset RaFD --image_size 128 \
    --c_dim 6 --rafd_image_dir data/RaFD/test \
    --sample_dir stargan_rafd/samples --log_dir stargan_rafd/logs \
    --model_save_dir stargan_rafd/models --result_dir stargan_rafd/results

yunjey commented 4 years ago

@zyzhang1130 Could you upload the detailed error message?

zyzhang1130 commented 4 years ago

RuntimeError: Found 0 files in subfolders of: data/RaFD/test
Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp

But I actually put my un-grouped .jpg images (for generation) in that path. The fact that it was searching the sub-folders makes me think the code assumes the images under data/RaFD/test are sorted into different categories.
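The error is consistent with how torchvision's `ImageFolder` scans a directory: each immediate sub-folder is treated as one class, and files placed directly in the root are not picked up. The sketch below mimics that scan with the standard library only, as a quick way to check a directory before running the test command. `count_images_per_class` is a hypothetical helper, not a torchvision or StarGAN function.

```python
from pathlib import Path

# Extensions copied from the error message above.
IMG_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp",
}


def count_images_per_class(root):
    """Return {sub-folder name: image count} for the sub-folders of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            # Files placed directly in root belong to no class and are skipped.
            continue
        counts[class_dir.name] = sum(
            1
            for p in class_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in IMG_EXTENSIONS
        )
    return counts
```

An empty result, or a result in which every count is zero, corresponds to the "Found 0 files in subfolders" situation.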

yunjey commented 4 years ago

@zyzhang1130 As I mentioned in the README, you should create the folder structure as described here. Create sub-folders in 'data/RaFD/test' and put each image into the corresponding sub-folder.

zyzhang1130 commented 4 years ago

Sure, but my test images contain only a single portrayal/expression for each person, since my objective is to generate multiple expressions for each person. So how should I put those images in different sub-folders? It feels like it defeats the purpose.

yunjey commented 4 years ago

@zyzhang1130 If there are no images for a particular domain (i.e. expression), leave that domain's folder empty.
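The layout suggested above can be sketched as follows, assuming all un-grouped test images share one expression. `group_test_images`, the class names, and the target class are hypothetical and chosen only for illustration. Some newer torchvision releases may raise an error for class folders that contain no images, so whether empty folders are accepted is worth checking against the installed version.

```python
import shutil
from pathlib import Path


def group_test_images(root, class_names, target_class):
    """Create one sub-folder per class and move loose files into one of them.

    Hypothetical helper: every file found directly in `root` is moved into
    `root/target_class`; the other class folders are created and left empty.
    Returns the number of files moved.
    """
    root = Path(root)
    if target_class not in class_names:
        raise ValueError(f"{target_class!r} is not in class_names")

    # Collect loose files before creating the class folders.
    loose_files = [p for p in root.iterdir() if p.is_file()]

    for name in class_names:
        (root / name).mkdir(exist_ok=True)

    for path in loose_files:
        shutil.move(str(path), str(root / target_class / path.name))

    return len(loose_files)
```

For example, `group_test_images("data/RaFD/test", ["angry", "happy", "neutral"], "neutral")` would leave `angry` and `happy` empty and place every image under `neutral`.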

zyzhang1130 commented 4 years ago

May I further clarify: is the reason for putting the test data in separate sub-folders that the data in the test folder was used for validation during the training stage? And is there an underlying assumption that the input for generation must belong to one of the domains? To illustrate, in the example below, if the input does not belong to any of the eight existing expressions, how do you decide which sub-folder to put it in? Thank you for the clarification.