junyanz / pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

custom dataset with different sizes #1468

Open youjin-c opened 2 years ago

youjin-c commented 2 years ago

Hello, @junyanz

Thank you for your superb paper and repo!

I am trying to build a normal face -> smiling face model. I found some face image datasets on Kaggle whose images are randomly sized. I read in the pre-processing section of the docs that there is a script that crops/scales images to a given resolution or ratio. So I wonder whether I can just use a dataset consisting of images of varying sizes.

Best, Youjin

icelandno1 commented 2 years ago

@andyli I have a similar problem. I have read "Training and Test Tips", but I still have questions about how to start training when my dataset contains images of irregular sizes such as 1000x520, 500x310, and 870x600, with no common pattern. When using the model, I would also like to output images at their original sizes. The dataset is somewhat similar to iphone2dslr_flower, but I didn't find an example covering this case, so I don't understand how to train and use the model. Any guidance would be much appreciated.

taesungp commented 2 years ago

Hi, there are two possible ways.

One is random cropping without any resizing. You specify --preprocess crop and set --crop_size to the smallest side length across your dataset (310 in this case). At test time you can then run at the original rectangular resolutions, because the network is fully convolutional. However, this may not be optimal, because the size gap between training and testing is quite large: 310x310 square crops at training time, but full 1000x520 images at test time.
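To make the first option concrete, here is a minimal PIL sketch of roughly what --preprocess crop does (the actual transforms live in the repo's data/base_dataset.py; the function name here is mine):

```python
import random
from PIL import Image

def random_crop(img, crop_size=310, rng=random):
    """Take a random square crop of side `crop_size`, with no resizing."""
    w, h = img.size
    x = rng.randint(0, w - crop_size)  # inclusive bounds, so the crop stays inside the image
    y = rng.randint(0, h - crop_size)
    return img.crop((x, y, x + crop_size, y + crop_size))

# e.g. a 1000x520 training image yields a 310x310 patch
img = Image.new("RGB", (1000, 520))
patch = random_crop(img)
print(patch.size)  # (310, 310)
```

At test time no crop is applied, so the generator sees the full 1000x520 image.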

The other way is to first resize the images to a similar resolution. You specify --preprocess scale_width_and_crop and set --load_size 1000 and --crop_size 520, so that each image is loaded scaled to a width of 1000px and then randomly cropped at the smallest resulting side length (520 in this case, since 500x310 becomes 1000x620 after resizing).
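The second option can be sketched the same way: scale to a fixed width, then take a random square crop. Again this is a simplified stand-in for the repo's transform pipeline, not its exact code:

```python
import random
from PIL import Image

def scale_width_and_crop(img, load_size=1000, crop_size=520, rng=random):
    """Mimic --preprocess scale_width_and_crop: scale to `load_size` width,
    preserving aspect ratio, then take a random `crop_size` square crop."""
    w, h = img.size
    new_h = int(round(h * load_size / w))
    img = img.resize((load_size, new_h), Image.BICUBIC)
    x = rng.randint(0, load_size - crop_size)
    y = rng.randint(0, new_h - crop_size)
    return img.crop((x, y, x + crop_size, y + crop_size))

img = Image.new("RGB", (500, 310))   # becomes 1000x620 after scaling
cropped = scale_width_and_crop(img)
print(cropped.size)  # (520, 520)
```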

At test time, you run with --preprocess scale_width, so that all images are processed at 1000px width. Afterwards, you can manually resize the output images back to their original sizes, e.g. 500x310.
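That final manual step is a one-liner with PIL; a small sketch (the helper name is mine, and you would record each source image's size before preprocessing):

```python
from PIL import Image

def restore_size(fake_img, original_size):
    """Resize a network output (processed at 1000px width) back to the source resolution."""
    return fake_img.resize(original_size, Image.BICUBIC)

out = Image.new("RGB", (1000, 620))        # output for a 500x310 source image
restored = restore_size(out, (500, 310))
print(restored.size)  # (500, 310)
```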