junyanz / CycleGAN

Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

Training and Testing #93

Closed mhusseinsh closed 6 years ago

mhusseinsh commented 6 years ago

Hello,

I am currently working with the CycleGAN network on my own dataset, and I have two questions:

1. I have far more images in domain A (~650,000) than in domain B (~20,000). Would it be better to have roughly the same number of images in both domains, or does the imbalance not matter much?

2. I can't find any logic for splitting my dataset into train and test sets, so what exactly did you use the test split for? Is it used during training in any way?

Thanks in advance

mhusseinsh commented 6 years ago

Hello again,

As far as I understand, I may need a test set of images from domain A, which will be translated to domain B. That makes sense: after training the network, I will feed it unseen data and inspect the generated images. But why would I need a test set for domain B?

doantientai commented 6 years ago

I think most of us use CycleGAN to translate from domain A to B only, so we usually need to test only one generator. However, in some cases people need both directions, which is why it requires images in both testA and testB by default. If you only want to test A->B and have run out of images to put in testB (like me), just copy images from testA into testB to make it work.

mhusseinsh commented 6 years ago

Hello, thanks a lot for your response. What is your suggestion concerning the first question?

doantientai commented 6 years ago

Since the images in A and B are not paired, I think the imbalance in numbers is not a serious problem. Of course, with fewer images in B you have lower diversity in that domain, but we can't change that anyway. What I would suggest is augmenting the data in set B (rotations, flips, lighting changes, ...) to make a bit more use of it.
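The augmentations suggested above (flips, rotations, lighting changes) can be sketched with plain NumPy. This is a minimal illustration, not code from the repo; a real pipeline would typically use an image library's transforms instead.

```python
import numpy as np

def augment(img, rng):
    """Return a randomly augmented copy of an HxWxC uint8 image.

    Label-free augmentations mentioned in the thread:
    horizontal flip, 90-degree rotation, and brightness jitter.
    """
    out = img.copy()
    if rng.random() < 0.5:                      # random horizontal flip
        out = out[:, ::-1]
    k = int(rng.integers(0, 4))                 # random 90-degree rotation
    out = np.rot90(out, k)
    gain = rng.uniform(0.8, 1.2)                # brightness jitter
    out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
aug = augment(img, rng)
print(aug.shape, aug.dtype)   # square input keeps its shape under rotation
```

Note that 90-degree rotations swap height and width, so for rectangular images you may want to restrict rotations or apply them before resizing.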

mhusseinsh commented 6 years ago

Hello again, sorry, I have a quick question. Do we need random cropping of the images as a preprocessing step? In the TensorFlow implementation here, all images are resized to 286x286 and then randomly cropped. I wonder why we do/need this?

mhusseinsh commented 6 years ago

And another quick question: the images in domain A are 200x88, while those in domain B are 1244x376. There is clearly a big difference in resolution; how much can this affect the results? I thought about resizing one domain to match the other. Is that the right way to handle this? And if I resize domain A to match B, can such large images be handled without problems? @doantientai @junyanz

junyanz commented 6 years ago

@mhusseinsh If you use loadSize=286 and fineSize=256, the code will automatically resize all images to 286x286 and then randomly crop 256x256 patches. We assume that the image resolutions of the two domains are roughly the same. For larger images, you may run into GPU memory limits during training. You can also resize images A and B to some resolution in between.
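The loadSize/fineSize preprocessing described above can be sketched as follows. This is a rough NumPy stand-in (nearest-neighbour resize) for illustration only; the actual implementations use proper image resizing.

```python
import numpy as np

def resize_then_random_crop(img, load_size=286, fine_size=256, rng=None):
    """Resize an HxWxC image to load_size x load_size, then take a
    random fine_size x fine_size crop, mirroring loadSize/fineSize."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    # nearest-neighbour index maps for the resize step
    rows = np.arange(load_size) * h // load_size
    cols = np.arange(load_size) * w // load_size
    resized = img[rows][:, cols]
    # random top-left corner for the crop step
    top = int(rng.integers(0, load_size - fine_size + 1))
    left = int(rng.integers(0, load_size - fine_size + 1))
    return resized[top:top + fine_size, left:left + fine_size]

img = np.zeros((600, 400, 3), dtype=np.uint8)
patch = resize_then_random_crop(img, rng=np.random.default_rng(1))
print(patch.shape)   # (256, 256, 3)
```

The resize-then-crop combination acts as a light data augmentation: each epoch sees slightly different 256x256 views of every training image.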

mhusseinsh commented 6 years ago

Hello @junyanz ,

Thanks for your reply. I am a bit confused now. I want to apply the same network to my larger dataset, which is 1244x376 pixels. How did you choose those numbers? And why are the training images, which are already 256x256, resized to the larger 286x286 and then randomly cropped?

For my high-resolution dataset, what should I set loadSize and fineSize to?

Deeplearning20 commented 6 years ago

@mhusseinsh @junyanz Can trainA and trainB differ in size? My B is 750x562 and A is 70x30. Or can they be smaller, for example 40x40? The picture size must be 256x256, mustn't it?

mhusseinsh commented 6 years ago

@Deeplearning20 In my case they were also different sizes, but I resized one domain to match the other. I don't know whether the network can support different sizes; in my opinion it won't. However, my sizes are not 256x256; they are different, and I edited some variables in the code accordingly.

@junyanz What is your opinion? Also, I think the resize and random-crop sizes are hyperparameters to be tuned, i.e. I don't have to stick to a particular size. In my case I had rectangular pictures, so I resized to a rectangular shape (not a square one) but did random square crops. What do you think about this?

Deeplearning20 commented 6 years ago

@mhusseinsh Thank you for your reply! @junyanz Hello, what is the minimum number of pictures, and how large can they be? Do the length and width need to maintain a certain ratio?

junyanz commented 6 years ago

@mhusseinsh Yes, loadSize and fineSize are two hyperparameters that may affect the results. Even if you apply square cropping during training, you can still test the model on a rectangular test image; that is what we did for horse->zebra. During testing, you need to specify resize_or_crop="scale_width". @Deeplearning20 The more images, the better, but CycleGAN can produce reasonable results on small datasets (e.g., 200 or 1,000 images). There is no need for the length and width to keep a certain ratio. Check out the resize_or_crop flag; it allows different kinds of preprocessing steps.
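The scale_width behaviour mentioned above can be sketched as: resize so the width matches the target while keeping the aspect ratio, so rectangular test images are never square-cropped. Again a rough NumPy stand-in for illustration, not the repo's actual resize code.

```python
import numpy as np

def scale_width(img, target_width=256):
    """Resize an HxWxC image so width == target_width, preserving
    the aspect ratio (rough equivalent of resize_or_crop="scale_width")."""
    h, w = img.shape[:2]
    new_h = max(1, round(h * target_width / w))
    # nearest-neighbour index maps for the resize
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(target_width) * w // target_width
    return img[rows][:, cols]

# 1244x376 is the resolution mentioned earlier in the thread
img = np.zeros((376, 1244, 3), dtype=np.uint8)
out = scale_width(img, 256)
print(out.shape)   # (77, 256, 3): aspect ratio preserved, width scaled to 256
```

Since the fully convolutional generator can run on any input size, only memory limits constrain the test resolution; in practice the height and width usually also need to be divisible by the network's downsampling factor.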

paviddavid commented 5 years ago

@junyanz I am a bit confused about what exactly resize_or_crop does. I am working with a dataset containing images of 512x256 or 1024x512, depending on my memory capacity; either way, the aspect ratio should be 1:2. How should I proceed, and which value of resize_or_crop do I need?

junyanz commented 5 years ago

See this line for the usage. You can use scale_width or scale_height. Also, see our PyTorch implementation for better preprocessing support.