twtygqyy / pytorch-LapSRN

Pytorch implementation for LapSRN (CVPR2017)
MIT License

Question about data augmentation #5

yuanshuai220 opened this issue 7 years ago

yuanshuai220 commented 7 years ago

Thanks for your code, it has helped me a lot. But I have some questions about data augmentation. In generate_train_lap_pry.m you only use downsizing to generate more training data, while in the paper the authors augment the training data in three ways: scaling, rotation, and flipping. Your performance is better than the paper's, yet your training set has only 7488 examples. I'm confused about this.

ZhangDY827 commented 7 years ago

@yuanshuai220 Hi, I have been reproducing the paper's results recently. The training data provided here is only a tiny sample; you can collect BSD200, T91, and General100 (391 images in total) as your training dataset using generate_train_lap_pry.m. I get a training dataset of size (11712, 1, 32, 32). After 200 epochs, I get an average PSNR of 31.32 on Set5 for 4x. After several tests, I find that the training dataset plays an important role in the results: the richer the training dataset, the better the result you will get. Data augmentation matters as well; you can add scaling, rotation, and flipping to the generate_train_lap_pry.m script yourself.
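For illustration, the three augmentations could look roughly like the following in Python. This is a minimal sketch, not the repo's actual code (which is MATLAB); the helper name and the scale list are assumptions, loosely following the paper's scaling/rotation/flipping scheme:

```python
import numpy as np

def augment(patch, scales=(1.0, 0.9, 0.8, 0.7)):
    """Yield scaled, rotated, and flipped variants of one HR patch.

    Hypothetical helper; `patch` is a 2-D (Y-channel) NumPy array, and the
    scale list is illustrative, not the paper's exact setting.
    """
    h, w = patch.shape
    for s in scales:
        # Nearest-neighbour downscale via index sampling (the MATLAB script
        # would use imresize with bicubic interpolation instead).
        rows = (np.arange(int(h * s)) / s).astype(int)
        cols = (np.arange(int(w * s)) / s).astype(int)
        scaled = patch[rows][:, cols]
        for k in range(4):                   # rotations: 0/90/180/270 degrees
            rotated = np.rot90(scaled, k)
            yield rotated
            yield np.fliplr(rotated)         # plus a horizontal flip of each
```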

yuanshuai220 commented 7 years ago

@CasdDesnDR I agree with you. If the training data is not enough, the neural network will overfit the training set, so the performance on the test set will not be good. I will add rotation and flipping to generate_train_lap_pry.m.

twtygqyy commented 7 years ago

@yuanshuai220 @CasdDesnDR Please refer to https://github.com/twtygqyy/pytorch-SRResNet/blob/master/data/generate_train_srresnet.m for adding flipping and rotation.

baiyancheng20 commented 7 years ago

@twtygqyy Hi, thank you for sharing your code. I want to know why you convert RGB images into the YCbCr colour space and use only the Y-channel information. What would the results be like using all RGB channels directly?

twtygqyy commented 7 years ago

Hi @baiyancheng20, I followed the LapSRN paper for the implementation. You can check https://github.com/twtygqyy/pytorch-SRResNet, where I used RGB images as inputs.
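For context, the Y-channel extraction used in LapSRN-style pipelines can be sketched as follows; a minimal example assuming the ITU-R BT.601 studio-swing coefficients that MATLAB's rgb2ycbcr applies, which is the usual convention in SR papers:

```python
import numpy as np

def rgb_to_y(img):
    """Extract the luma (Y) channel from an HxWx3 uint8 RGB image.

    Uses the ITU-R BT.601 studio-swing coefficients (what MATLAB's
    rgb2ycbcr applies); a sketch, not code from this repo.
    """
    rgb = img.astype(np.float64) / 255.0
    y = 16.0 + 65.481 * rgb[..., 0] + 128.553 * rgb[..., 1] + 24.966 * rgb[..., 2]
    return y / 255.0  # Y normalized to roughly [16/255, 235/255]
```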

sriprabhar commented 6 years ago

@twtygqyy Hi, thank you for sharing your LapSRN code. I took your PyTorch code from GitHub and ran it; it works only for grayscale images. I modified lapsrn.py to add support for RGB color images. Then I took just one color image from the Urban100 dataset and applied the augmentations from your MATLAB code (generate_train_lap_pry.m), which gave around 165 color image patches of size 128x128. Using these patches (as an h5 file) I trained the network for 100 epochs. For testing, I modified test.py for color images and gave it a 32x32 crop of the original training image as input. The results are very poor, and I'm not sure where I'm going wrong. I have attached the modified code and my results, and would appreciate any technical advice on how to proceed to get correct results.

sourcefiles.zip

twtygqyy commented 6 years ago

@sriprabhar Hi, I understand that you tried to overfit the network on a small dataset. What does the loss look like during training? Did it converge well?

sriprabhar commented 6 years ago

Thanks for your response. I took the building image (attached) and extracted several overlapping patches.

Training trial 1

stride = 64, giving 15x11 patches of size 128x128. Convergence observed (attached Plot 1). Trained for 100 epochs.

Training trial 2

stride = 16, giving 41x57 patches, each of size 128x128. Convergence observed (attached Plot 2). Trained for 5 epochs using the trial-1 model as the pre-trained model.
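For reference, these patch counts are consistent with stride-based extraction from a source image of roughly 1024x768 pixels: (1024 - 128)/64 + 1 = 15 and (768 - 128)/64 + 1 = 11 patches per axis for trial 1, and 57 and 41 for stride 16. A minimal NumPy sketch of such an extraction (the helper name is hypothetical):

```python
import numpy as np

def extract_patches(img, patch=128, stride=64):
    """Slide a patch x patch window over a 2-D image with the given stride.

    Hypothetical NumPy analogue of the patch loop in generate_train_lap_pry.m;
    returns an array of shape (N, patch, patch).
    """
    h, w = img.shape[:2]
    return np.stack([img[r:r + patch, c:c + patch]
                     for r in range(0, h - patch + 1, stride)
                     for c in range(0, w - patch + 1, stride)])
```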

Test images

  1. An image of size 32x32 cropped from the building image.
  2. An image patch taken from the building image used for training, downsampled to 32x32.

I'm not sure how to solve it if it's an overfitting problem. Please have a look at the attachments.

(Attachments: figure_trainingtrial1, figure_trainingtrial2, figure_building, figure_patch, building)

sriprabhar commented 6 years ago

Hi, also: for training we have to create the dataset in HDF5 format using the MATLAB code. When creating an h5 file from the patches of even a single image, the file size is huge (for 165 color patches the h5 file is around 500 MB; for 57x41 patches it is around 3 GB). If I have a folder containing around 50 images of size 1080x1080 and run the MATLAB code for RGB color images, the system hangs. I'm not sure whether I'm following the correct method for dataset creation and training. Thanks for any help/suggestions.

twtygqyy commented 6 years ago

@sriprabhar Hi, I think the way you generate the h5 is correct, but you will probably get too many small patches out of a 1080x1080 image because of its size. (3 GB is not that big, TBH :) )

A quick fix is to increase the stride when you run the MATLAB generation code. The better solution is to generate multiple h5 files, create a new generator that takes the folder containing the h5 files as input, and fetch data from one h5 file at a time.
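A sketch of that generator idea, assuming each h5 file stores 'data' (LR) and 'label' (HR) datasets as the training scripts in this style expect; the class itself and its name are illustrative:

```python
import glob
import h5py
import torch.utils.data as data

class MultiH5Dataset(data.Dataset):
    """Read patches lazily from a folder of h5 files, one file at a time.

    Sketch only; assumes each file holds 'data' (LR) and 'label' (HR)
    datasets indexed along the first axis.
    """
    def __init__(self, folder):
        self.paths = sorted(glob.glob(folder + '/*.h5'))
        self.lengths = []
        for p in self.paths:                 # record per-file sample counts
            with h5py.File(p, 'r') as f:
                self.lengths.append(f['data'].shape[0])

    def __len__(self):
        return sum(self.lengths)

    def __getitem__(self, idx):
        # Map the global index to the file that holds it, then read one patch.
        for path, n in zip(self.paths, self.lengths):
            if idx < n:
                with h5py.File(path, 'r') as f:
                    return f['data'][idx], f['label'][idx]
            idx -= n
        raise IndexError(idx)
```

Opening the file per item keeps memory usage flat at the cost of extra I/O; caching the most recently used file handle would be an easy optimization.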

twtygqyy commented 6 years ago

@sriprabhar Also, the result you plotted makes sense to me, because the image you tested might not be exactly the same one you used in training. Grab one image from the h5 file you used for training and see if the result looks better.
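For example, a quick sanity check along those lines might look like this; the filename and the checkpoint layout are assumptions based on how test.py in this repo loads models:

```python
import h5py
import torch

# Minimal sanity check: run one training patch back through the trained model.
# 'train.h5' and 'model_epoch_100.pth' are hypothetical filenames; test.py in
# this repo loads checkpoints via torch.load(path)["model"].
with h5py.File('train.h5', 'r') as f:
    lr = torch.from_numpy(f['data'][0:1]).float()   # shape (1, C, h, w)

model = torch.load('model_epoch_100.pth')['model']
model.eval()
with torch.no_grad():
    hr_2x, hr_4x = model(lr)   # LapSRN produces both the 2x and 4x outputs
```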

sriprabhar commented 6 years ago

Thank you for your response; I will try feeding in a training patch. One more doubt: since LapSRN works on the Y component alone, we bicubic-interpolated the Cb and Cr channels and merged them with the LapSRN super-resolved Y component, and the results were good. If the Y component is sufficient for training and for PSNR measurement, I would like to know why one would train on RGB images (as in SRResNet).
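For reference, the Y plus bicubic-Cb/Cr merge described here can be sketched with PIL as follows; the function and variable names are illustrative, and note that PIL's YCbCr uses the full-range JPEG convention, which differs slightly from MATLAB's rgb2ycbcr:

```python
from PIL import Image

def merge_y_cbcr(sr_y, lr_img, scale=4):
    """Merge a super-resolved Y channel with bicubic-upsampled Cb/Cr.

    Sketch only: `sr_y` is the network output as an HxW uint8 array and
    `lr_img` is the low-resolution PIL image.
    """
    w, h = lr_img.size
    _, cb, cr = lr_img.convert('YCbCr').split()
    cb = cb.resize((w * scale, h * scale), Image.BICUBIC)
    cr = cr.resize((w * scale, h * scale), Image.BICUBIC)
    y = Image.fromarray(sr_y, mode='L')
    return Image.merge('YCbCr', (y, cb, cr)).convert('RGB')
```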

twtygqyy commented 6 years ago

Hi @sriprabhar, you can have a look at Section 5.1 of the paper "Fast and Accurate Image Super-Resolution Using A Combined Loss". They compared the difference between training with Y and with RGB for SR.