junyanz / pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

help with the colorization dataset mode #478

Open ajaykmr2292 opened 5 years ago

ajaykmr2292 commented 5 years ago
  1. I just saw the latest push for the colorization dataset mode. I followed the steps in the "Notes for colorization" section of tips.md and trained the model. Can you tell me how to test the model? If I run test.py with the command python test.py --dataroot ./datasets/MY_DATASET_DIR/ --name color_pix2pix --model pix2pix_colorization --phase test, it throws this error:

AttributeError: 'Pix2PixColorizationModel' object has no attribute 'real_B_rgb'

Please tell me how to run the test.py file for colorization dataset mode.

  2. Also, I assume the colorization dataset directory must be inside the datasets directory and take the form ./datasets/MY_DATASET_DIR_NAME/train containing the training images (just the colored images) and ./datasets/MY_DATASET_DIR_NAME/test containing the test images (again, just the colored images). I hope that's correct. If it is, then when I trained the model, the number of training images was reported as the sum of the images in the train and test directories. Can you please tell me what the issue is?

  3. I trained the model for colorization using the earlier code (before the colorization dataset mode was pushed). With the same model architecture, the model did not overfit for 256x256 images, but it did overfit for 512x512 images. Can you explain the possible reasons?

junyanz commented 5 years ago
  1. I fixed the issue with the latest commit.
  2. Yeah, add your images to the train and test directories. I also added a new script, scripts/test_colorization.sh.
  3. I am not sure about the overfitting. You can try both the resnet_9blocks and unet_256 architectures.
ajaykmr2292 commented 5 years ago

I am new to GANs. I have this query: if we cannot interpret anything from the loss functions and the generated images are not good, how do we decide which parameters (or which parts of the model architecture) to change and check?

junyanz commented 5 years ago

The loss functions can be used to identify failure modes, as suggested by ganhacks. For example, if the D loss is always 0, D may be too strong. You can increase the capacity of G.
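As a rough illustration of that kind of check, here is a minimal sketch; the class name, window size, and threshold are arbitrary illustrative choices, not anything from this repo:

```python
# Illustrative only: a simple monitor for the failure mode described above,
# i.e. the discriminator loss collapsing towards zero (D overpowering G).
# The window size and threshold are arbitrary choices, not values from this repo.
from collections import deque

class DLossMonitor:
    def __init__(self, window=200, threshold=1e-3):
        self.history = deque(maxlen=window)   # recent discriminator losses
        self.threshold = threshold

    def update(self, d_loss):
        self.history.append(float(d_loss))

    def collapsed(self):
        # True if the D loss has stayed near 0 for the whole window: D is "too strong",
        # so the generator receives almost no useful gradient.
        return (len(self.history) == self.history.maxlen
                and max(self.history) < self.threshold)

# Usage inside a training loop (sketch):
# monitor = DLossMonitor()
# monitor.update(loss_D.item())
# if monitor.collapsed():
#     print("D loss stuck at ~0: consider a larger G, label smoothing, or a different GAN loss")
```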

ajaykmr2292 commented 5 years ago
  1. Please help me with this: I have trained the model on 25-odd images for converting binary manga images to colored images (I know the dataset is very small, but I at least expect the model to overfit on 25-odd images with the unet_256 architecture, which is not happening). The loss curves look like this: [loss curves image]

When the binary input image used for training is sent to the generator (after training for 500 epochs), the result looks like this: [input binary image] [real colored image] [generated color image]

And for the test set, the output looks like this: [input binary image: 0002-011 png_real_a] [generated color image: 0002-011 png_fake_b]

As can be seen, the generated images for both the train and test sets are poor. The losses seem normal, but the generator still has not learned anything useful (and it has not overfit the training data even after 500 epochs, as the generated image for the training input shows). Can you suggest anything here? I feel a single input image has too many features for the generator to learn during training. Would it be better to use a more complex architecture, such as inception modules (the GoogLeNet architecture)? Please suggest any other alternatives to try.

I feel this task is analogous to the edges2shoes conversion, except that the input image has too many features to learn at once. Please tell me whether I am right and also suggest something to try.

  2. Also, can you tell me how to determine whether vanishing gradients are the reason the network is performing poorly?
  3. ganhacks lists "D loss goes to 0" as a failure mode. Is my discriminator loss curve showing this failure mode?
  4. Also, ganhacks says the real label should be a random number between 0.7 and 1.2, but in this implementation the real label is fixed at 1, right? Can you tell me the intuitive impact of using a random number between 0.7 and 1.2?
ajaykmr2292 commented 5 years ago

Also, in order to understand the concept in depth, I tried to train the pix2pix model on the edges2handbags dataset (138k images, downloaded using your code). After training for 15 epochs with a batch size of 32, the generated image looks like this: [generated image: 116_ab_fake_b] Can you tell me what is going wrong? How did the paper get such good images? I am just running the same code. Thanks in advance.

junyanz commented 5 years ago
  1. For your own task, you probably need more images. 25 might not be enough; you probably need thousands or even tens of thousands of training images. You may also want to check out the colorization-PyTorch repo. You can fine-tune a pre-trained colorization model on your training images.
  2. 0.7-1.2 might help stabilize GAN training (see the sketch after this list). You are free to modify our code. You can also try other GAN losses (e.g., LSGAN) within this repo. Please see Section 3.4 of the improved-GAN paper for more details.
  3. For edges2handbags, we used batchSize = 4 in the paper with the Torch code. Also, see Figure 21 in the paper for typical failure cases of pix2pix.
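A minimal sketch of the one-sided label smoothing discussed in point 2, assuming a standard BCE-with-logits discriminator loss; the function name and loss structure here are illustrative, not this repo's actual GAN-loss code:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def d_loss_with_label_smoothing(d_real_logits, d_fake_logits):
    # Real targets drawn uniformly from [0.7, 1.2] instead of a fixed 1.0,
    # following the ganhacks suggestion discussed above; fake targets stay at 0.
    # BCEWithLogitsLoss accepts targets above 1.0 numerically.
    real_targets = torch.empty_like(d_real_logits).uniform_(0.7, 1.2)
    fake_targets = torch.zeros_like(d_fake_logits)
    return bce(d_real_logits, real_targets) + bce(d_fake_logits, fake_targets)
```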
ajaykmr2292 commented 5 years ago
  1. Thanks for your patient reply. But there was a case where the model was trained to watercolor a black-and-white image (a church-like input image; an issue was raised here about that: Issue) using just 2 training images, and after training it was found to work. Can I know why it is not working in my case?

  2. Also, for me, 256x256 input images do not overfit, whereas 512x512 images do (they are colored exactly like the training image). Can I conclude that this is because there are too many small features in the image, which the 3x3 filters in the generator cannot capture at 256x256 but can capture at 512x512?

ajaykmr2292 commented 5 years ago

Also, I am teaching myself how to train GANs (and deep networks in general). I would like to know how to go about training GANs: is tweaking parameters and loss functions and then checking the output purely trial-and-error, or is there some logic behind each change? Please share some tips on how to go about training GANs (or any deep networks).

junyanz commented 5 years ago
  1. Colorization might be different from artist style transfer. From my experience, 25 images might not be enough.
  2. I don't have a good answer. To prevent overfitting, you may want to apply some random cropping and flipping (e.g., for 512x512 images, use --load_size 600 --crop_size 512); see the sketch after this list.
  3. For GAN training, Section 3.4 of the improved-GANs paper explains some of the logic behind the tricks. I would also recommend reading Ian's tutorial on GANs.
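For intuition, the --load_size 600 --crop_size 512 combination amounts to roughly the following torchvision-style augmentation; this is a hedged sketch, not the repo's actual data pipeline, and the square resize is an assumption about the default preprocessing:

```python
from torchvision import transforms

# Roughly what --load_size 600 --crop_size 512 enables: upscale first, then take a
# random 512x512 crop and randomly flip it, so the network rarely sees exactly the
# same 512x512 view of a training image twice.
augment = transforms.Compose([
    transforms.Resize((600, 600)),        # load_size (square resize assumed here)
    transforms.RandomCrop(512),           # crop_size
    transforms.RandomHorizontalFlip(),    # the repo can disable this via --no_flip
    transforms.ToTensor(),
])
```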
superkido511 commented 4 years ago

@ajaykmr2292 Can you tell me where you got the One Piece dataset? I've been looking all over the manga sites but cannot find paired color/uncolored versions.

sun11711 commented 7 months ago

Could you please provide the code for plotting the loss curves?
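For anyone with the same question, a minimal hedged sketch: it assumes the training run wrote a loss_log.txt under checkpoints/<experiment_name>/ with lines roughly of the form "(epoch: 1, iters: 100, ...) G_GAN: 0.62 G_L1: 12.3 D_real: 0.43 D_fake: 0.51"; the path, loss names, and log format are assumptions, so adjust the regex if your log differs.

```python
# Hedged sketch (not code from this repo): parse loss_log.txt and plot each loss term.
import re
from collections import defaultdict
import matplotlib.pyplot as plt

def plot_loss_log(path="checkpoints/color_pix2pix/loss_log.txt"):
    losses = defaultdict(list)
    with open(path) as f:
        for line in f:
            if not line.startswith("("):
                continue  # skip header/separator lines
            # Only parse the "name: value" pairs after the "(epoch: ..., iters: ...)" prefix.
            tail = line.split(")", 1)[-1]
            for name, value in re.findall(r"(\w+): (-?\d+\.?\d*)", tail):
                losses[name].append(float(value))
    for name, values in losses.items():
        plt.plot(values, label=name)
    plt.xlabel("logging step")
    plt.ylabel("loss")
    plt.legend()
    plt.show()

if __name__ == "__main__":
    plot_loss_log()
```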