SeungjunNah / DeepDeblur_release

Deep Multi-scale CNN for Dynamic Scene Deblurring

Large values in output #17

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hello again,

During my implementation I encountered a problem: the input values are in [0, 1], but the output values of the net (the predictions, i.e. images after passing through the net) are in [-10^7, 10^7]. As a sanity check I set scales = 1 and trained on 2 images; the purpose was to make sure the net can learn and overfit those 2 images. Even after 40,000 iterations the output values were in [-7, 7] and not smaller than that. Maybe there is some ReLU or BN that is not in the paper? Or did you have the same problem?

Thanks Sivan

SeungjunNah commented 6 years ago

No, I didn't add any ReLU or BN that is not mentioned in the paper. Actually, they are not helpful for image restoration. You may check the analysis described in my other work: EDSR (https://github.com/LimBee/NTIRE2017). For a Python reference, you may want to check out the PyTorch implementation: EDSR-PyTorch (https://github.com/thstkdgus35/EDSR-PyTorch).
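
For concreteness, here is a minimal PyTorch-style sketch of a BN-free residual block in that spirit (conv -> ReLU -> conv plus an identity skip). The channel count and kernel size are illustrative placeholders, not necessarily the paper's exact settings:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block without batch norm: conv -> ReLU -> conv, plus identity skip."""
    def __init__(self, n_feats=64, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2  # zero padding keeps the spatial resolution
        self.conv1 = nn.Conv2d(n_feats, n_feats, kernel_size, padding=pad)
        self.conv2 = nn.Conv2d(n_feats, n_feats, kernel_size, padding=pad)

    def forward(self, x):
        res = self.conv2(torch.relu(self.conv1(x)))
        return x + res  # no BN, and no extra ReLU after the addition
```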

First, I would recommend not using adversarial loss for now; use L1 or L2 loss only. Here is a checklist (a rough code sketch of these settings follows below):

- Initialization: I used Xavier initialization, which is the default in Torch.
- Learning rate: I used 5e-5. Too large a learning rate may lead to odd results.
- Optimizer: ADAM.
- Training loss: What is the shape of the loss curve? Is the loss going down? Is it stable? Did it converge? If you are using Python, you could use TensorBoard to check various values with visualizations.
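
In code, those settings would look roughly like the following PyTorch-style sketch (the `model` argument and the helper name are illustrative assumptions, not code from this repository):

```python
import torch

def configure_training(model, lr=5e-5):
    """Set up the checklist items above: Xavier init, Adam at 5e-5, L2 loss.

    `model` is an assumed nn.Module placeholder; names here are illustrative.
    """
    # Xavier (Glorot) initialization for every conv layer, biases set to zero.
    for m in model.modules():
        if isinstance(m, torch.nn.Conv2d):
            torch.nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                torch.nn.init.zeros_(m.bias)
    # Adam with the learning rate from the paper; plain L2 loss for the sanity check.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.MSELoss()
    return optimizer, criterion
```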

In practice, it is quite easy to regress an output that is similar to the input. There could be simple mistakes that are easy to amend.

ghost commented 6 years ago

Hi, I didn't mention it, but I'm using only L2 loss for the sanity check. I also use Xavier initialization, which is the default in TF as well. I use the same learning rate that you wrote in the paper, which is 5*10^-5, and I use ADAM. The training loss goes down from 10^7 to 0.3 and converges (I used TensorBoard to check this).

Even though the loss goes down, I can see (via prints) that the minimum and maximum values of the output are not in [0, 1] but in [-7, 7] after 40,000 iterations, and the loss is not zero even after many iterations. I also inspected the graph with TensorBoard and everything is connected correctly. I can't think of a cause for this problem.

SeungjunNah commented 6 years ago

Did you check if the image pyramid is correct?

Have you tried the identity-preserving task (sharp image -> model -> sharp image)? That should be easier than deblurring and could be a good checkpoint. If this configuration does not work, there is likely a critical problem in your implementation.
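
A rough sketch of that identity check in PyTorch-style Python (the `model` and `sharp` arguments and the helper name are assumed placeholders, not repository code):

```python
import torch

def identity_sanity_check(model, sharp, num_steps=5000, lr=5e-5):
    """Overfit the model to reproduce a single sharp image (input == target).

    `sharp` is assumed to be a [1, C, H, W] tensor in [0, 1]. If the trained
    output does not stay close to [0, 1], the architecture or initialization
    is the likely culprit rather than the deblurring data.
    """
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(num_steps):
        optimizer.zero_grad()
        loss = criterion(model(sharp), sharp)
        loss.backward()
        optimizer.step()
    output = model(sharp).detach()
    print('final loss:', loss.item(),
          'output range:', output.min().item(), output.max().item())
```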

If the above are OK, well, I can't figure out the cause without knowing the implementation details. I doubt I would be able to help with a TensorFlow implementation, though, since I'm not familiar with the framework.

ghost commented 6 years ago

Hi, yes, I checked the pyramid, and I also checked that after augmentation the input and target are still in [0, 1]. In this simple test I set scales = 1, so it's definitely not the pyramid. I also set target -> target. If I reduce the number of ResBlocks to 1, it works fine. I think it might be a difference in initialization between Torch and TF. Do you know which version of Xavier initialization you used (uniform or Gaussian)? Thanks! Sivan

By the way, when you say it's supposed to be 'quick' to train (sharp image -> model -> sharp image): if I try to overfit on 2 images, how many iterations do you think it should take? You used a different GPU, so I'm not sure what counts as quick for me.

SeungjunNah commented 6 years ago

Torch uses a uniform distribution for conv layer initialization. You may want to check the nn.SpatialConvolution:reset function in this link.
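
For reference, an approximate PyTorch-style replication of that default reset, assuming the usual U(-stdv, stdv) bound with stdv = 1/sqrt(kW * kH * nInputPlane); this is a sketch for comparison, not the repository's code:

```python
import math
import torch

def legacy_torch_conv_init_(conv):
    """Approximate (Lua) Torch's default nn.SpatialConvolution:reset.

    Weights and biases are drawn from U(-stdv, stdv) with
    stdv = 1 / sqrt(kW * kH * nInputPlane). `conv` is an nn.Conv2d.
    """
    k_h, k_w = conv.kernel_size
    stdv = 1.0 / math.sqrt(k_h * k_w * conv.in_channels)
    conv.weight.data.uniform_(-stdv, stdv)
    if conv.bias is not None:
        conv.bias.data.uniform_(-stdv, stdv)
    return conv
```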

ghost commented 6 years ago

Hi, thanks. And about the number of iterations the target-to-target test should take, do you remember how long it took?

SeungjunNah commented 6 years ago

Sorry, I don't remember.

ghost commented 6 years ago

Hi again! I think I found the bug. You mention in your paper that:

Then, 19 ResBlocks are stacked followed by last convolution layer that transforms the feature map into input dimension. Every convolution layer preserves resolution with zero padding. In total, there are 40 convolution layers. The number of convolution layers at each scale level is determined so that total model should have 120 convolution layers.

I understand from this that if I have scales = 2, then scale 1 has 80 conv layers and scale 0 has 40, and if scales = 1 then scale 0 has 120 conv layers. But when I look at your code, it seems that no matter how many scales the net has, every scale has 40 conv layers. So with scales = 1 the net has 40 conv layers, with scales = 2 it has 80 conv layers, etc. Did I get that right?

Thank you very much! Sivan

SeungjunNah commented 6 years ago

Oh, 40 conv layers per scale is simply the default value of the optional input argument, meant for 3 scales. For example, a 2-scale network should have 60 conv layers per level. The proper inputs for the paper experiments should be one of:

- `-scale_levels 1 -nlayers 120`
- `-scale_levels 2 -nlayers 60`
- `-scale_levels 3 -nlayers 40`

However, a smaller number of conv layers should also be fine, apart from some performance decrease.
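
As a quick illustration of that bookkeeping (plain arithmetic, not repository code), each scale level spends its conv budget on one input conv, a stack of 2-conv ResBlocks, and one output conv:

```python
def layers_per_scale(scale_levels, total_convs=120):
    """Split the total conv budget evenly across scale levels.

    Each level uses one input conv, one output conv, and ResBlocks of
    2 conv layers each. Purely illustrative arithmetic.
    """
    nlayers = total_convs // scale_levels   # conv layers per scale level
    n_resblocks = (nlayers - 2) // 2        # ResBlocks per level
    return nlayers, n_resblocks

for s in (1, 2, 3):
    print(s, layers_per_scale(s))
# 1 -> (120, 59), 2 -> (60, 29), 3 -> (40, 19); 3 scales matches the paper's 19 ResBlocks per scale
```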

ghost commented 6 years ago

thanks