KupynOrest / DeblurGAN

Image Deblurring using Generative Adversarial Networks

What does unaligned, aligned, single represent in the dataset_mode? #21

Closed · benjaminlinken closed this 6 years ago

benjaminlinken commented 6 years ago

What does unaligned, aligned, single represent in the dataset_mode?

KupynOrest commented 6 years ago

Hello, the aligned dataset is the output of the combine_A_and_B script, which produces one image containing the corresponding blurred and sharp photos side by side. The unaligned dataset is used for training when you have images of class A and class B in different folders, and the single dataset is used for testing, assuming you have only the images from class A (blurred photos).
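For illustration, a minimal sketch of how one aligned sample can be built; the file paths are hypothetical placeholders, and this is not the repository's script itself:

```python
# Minimal sketch (not the repo's combine_A_and_B script itself): build one
# "aligned" sample by pasting the blurred image (A) and the sharp image (B)
# side by side. File paths are placeholders.
from PIL import Image

blurred = Image.open("datasets/A/0001.png")   # class A: blurred photo
sharp = Image.open("datasets/B/0001.png")     # class B: sharp photo

w, h = blurred.size
combined = Image.new("RGB", (2 * w, h))
combined.paste(blurred, (0, 0))               # left half:  A
combined.paste(sharp, (w, 0))                 # right half: B
combined.save("datasets/train/0001.png")      # one aligned training image
```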

TrinhQuocNguyen commented 6 years ago

Hello KupynOrest, thank you for your awesome code. I am retraining the model using my own data. If I use the aligned mode, I have to put into the "train" folder images with width x height = 200 x 100, for example.

1. Each image consists of two parts: a 100 x 100 blurry image on the left and a 100 x 100 sharp image on the right. Then I run this command: python train.py --dataroot ./datasets/train --which_direction AtoB --fineSize 100 --loadSizeX 100 --loadSizeY 100

Are there any steps I am doing wrong (especially in the command)?

2. Why did you crop the image in your code (loadSizeX=640, loadSizeY=360, fineSize=256)? Can we just feed the entire image into the model for training, say loadSizeX=640, loadSizeY=360, fineSize=640 or 360?

Thank you.

KupynOrest commented 6 years ago

@TrinhQuocNguyen

1. It seems that everything is okay here; does it work?

2. We can, but for training I crop 256x256 patches to speed up the training process. Also, for efficiency in PyTorch the recommended size is a power of 2, so depending on your computational resources I would recommend training on 256x256 or 512x512 patches.
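For reference, a rough sketch of the paired-crop idea: take the same random crop from both the blurred and the sharp image so the pair stays aligned. The names and sizes are illustrative, not the repository's exact dataset code:

```python
# Rough sketch of the idea behind fineSize: crop the same random 256x256
# window from both images of a pair. Assumes both images are at least
# `size` pixels in each dimension.
import random
from PIL import Image

def paired_random_crop(blurred: Image.Image, sharp: Image.Image, size: int = 256):
    w, h = blurred.size
    x = random.randint(0, w - size)
    y = random.randint(0, h - size)
    box = (x, y, x + size, y + size)
    return blurred.crop(box), sharp.crop(box)
```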

TrinhQuocNguyen commented 6 years ago

Hi KupynOrest, thanks for the reply. 1. I am training the model; apparently it is working, but I have to wait to see how the results look. 2. Thank you, I understand the reason now. I was a bit worried about losing image data, but it seems that it's not a big deal. 3. Do you think we can make it run in real time on video (using OpenCV)?

KupynOrest commented 6 years ago

Actually, you are not losing any data: the model is applied convolutionally, so you can train on image patches and then test on the full image. I am not sure about real time, as I haven't benchmarked the performance in different settings; in my scenario the inference time is 0.3-0.8 s per image on different GPUs, for images from 640x360 to 1280x720. You might try it though; I would be very interested to see the results. However, if you need it specifically for video deblurring, there are other methods that can benefit from the additional temporal information.
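A toy illustration of why patch training still works at full resolution: a purely convolutional network accepts any spatial size, so a model trained on 256x256 patches can be run on, say, 640x360 frames at test time. The two-layer stand-in below is not the DeblurGAN generator, just the smallest network that shows the property:

```python
# A purely convolutional net preserves spatial size (kernel 3, padding 1)
# and accepts any input resolution.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 3, kernel_size=3, padding=1),
)

patch = torch.randn(1, 3, 256, 256)   # training-time patch
frame = torch.randn(1, 3, 360, 640)   # full test-time image
print(net(patch).shape)               # torch.Size([1, 3, 256, 256])
print(net(frame).shape)               # torch.Size([1, 3, 360, 640])
```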

TrinhQuocNguyen commented 6 years ago

Hi KupynOrest, thanks for your reply. One image per 0.3-0.8 s means 1.25-3.3 frames/second, so processing real-time video seems likely impossible. Could you tell me some of those methods for processing blurry videos? As far as I know, the YOLO model can run in real time (on detection tasks); what do you think about these GAN models, and how could we make them process video in real time? Thank you.

TrinhQuocNguyen commented 6 years ago

Hi KupynOrest, I've found that my training command did not use --learn_residual, but the test command you showed does have --learn_residual. It would be better to train with --learn_residual, wouldn't it?

KupynOrest commented 6 years ago

@TrinhQuocNguyen I am currently investigating the possibility of running my model in real time; however, for video deblurring you can take a look at this paper: https://arxiv.org/abs/1611.08387

Also, in our work we find that the global residual connection helps restore finer texture details, so it is better to train with --learn_residual, but you should still be able to get pretty good results even without it.
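A sketch of the idea behind --learn_residual: the generator predicts a residual that is added back to the blurred input, with the sum clamped to the network's [-1, 1] output range. This mirrors the concept rather than the repository's exact forward pass:

```python
# Global residual connection: the network only has to learn the
# correction (roughly sharp - blurred), not the whole image.
import torch

def deblur(generator, blurred):
    residual = generator(blurred)              # predicted correction
    return torch.clamp(blurred + residual, -1.0, 1.0)
```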

TrinhQuocNguyen commented 6 years ago

Dear KupynOrest, thank you for your reply; I have read the paper. Yes, it processes video, but not in real time. I have tried other GAN models, but so far I have not found any papers that describe processing video in real time (captured live from cameras). Do you know of any?

Thank you for your hard work.

KupynOrest commented 6 years ago

@TrinhQuocNguyen Sorry for the late reply. You can take a look at this paper, which builds on our work: https://arxiv.org/abs/1801.05117. To make it suitable for real time you would need to optimize the generator (for example, use a lightweight architecture instead of ours). If you still have questions, feel free to send me a message.
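As one hedged example of what a "lightweight architecture" swap could look like: replacing standard convolutions with depthwise-separable ones (MobileNet-style) cuts parameters and FLOPs substantially. This is a generic building block, not code from DeblurGAN or the linked paper:

```python
# Depthwise-separable convolution: a per-channel 3x3 conv followed by a
# 1x1 conv that mixes channels; much cheaper than a full 3x3 conv.
import torch.nn as nn

class SeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=1, groups=in_ch)  # per-channel 3x3
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # 1x1 mix

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```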