limbee / NTIRE2017

Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution"
651 stars 146 forks

Reproducing the model in PyTorch #17

Closed luomomo closed 6 years ago

luomomo commented 6 years ago

Hi, I am trying to reproduce the baseline x2 model in PyTorch. I believe my model is built exactly as described in the paper and in baseline.lua. I converted your model baseline.x2.t7 to baseline.x2.pth and found that its parameters differ from my PyTorch model's. Did you add two more conv() layers in your training, or is there something wrong in my steps?

| my baseline | params size | baseline origin | params size |
| --- | --- | --- | --- |
| conv_in.weight | 64x3x3x3 | 0.weight | 64x3x3x3 |
| xxx | xxx | 1.weight | 64x64x3x3 |
| residual.layer.0.conv1.weight | 64x64x3x3 | 2.0.0.0.0.0.0.weight | 64x64x3x3 |
| residual.layer.0.conv2.weight | 64x64x3x3 | 2.0.0.0.0.0.2.weight | 64x64x3x3 |
| residual.layer.1.conv1.weight | 64x64x3x3 | 2.0.0.1.0.0.0.weight | 64x64x3x3 |
| residual.layer.1.conv2.weight | 64x64x3x3 | 2.0.0.1.0.0.2.weight | 64x64x3x3 |
| …… | …… | …… | …… |
| residual.layer.15.conv2.weight | 64x64x3x3 | 2.0.0.15.0.0.2.weight | 64x64x3x3 |
| conv_mid.weight | 64x64x3x3 | 2.0.0.16.weight | 64x64x3x3 |
| conv3.weight | 256x64x3x3 | 3.0.weight | 256x64x3x3 |
| conv_out.weight | 3x64x3x3 | 4.weight | 3x64x3x3 |
| xxx | xxx | 5.weight | 3x3x1x1 |
sanghyun-son commented 6 years ago

Hello.

xxx in the last row is just a mean-shifting module (it only has biases, and its convolution kernel is an identity kernel).
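As a rough illustration (my own sketch, not the repository's code), a mean-shifting module of this kind can be modeled as a 1x1 convolution whose weight (shape 3x3x1x1, matching the "5.weight" row above) is the identity, so only the bias changes the image. The RGB mean values below are the commonly quoted DIV2K statistics and are an assumption here:

```python
import numpy as np

# Hypothetical sketch of a mean-shifting module: a 1x1 convolution with an
# identity kernel, so the layer only adds its bias (the dataset RGB mean)
# to every pixel, per channel.
RGB_MEAN = np.array([0.4488, 0.4371, 0.4040])  # assumed DIV2K mean, 0-1 scale

def mean_shift(img, sign=+1):
    """img: (3, H, W) array; returns img shifted by sign * RGB_MEAN."""
    weight = np.eye(3).reshape(3, 3, 1, 1)  # identity 1x1 kernel, shape 3x3x1x1
    bias = sign * RGB_MEAN
    # A 1x1 convolution is a per-pixel matrix multiply over channels;
    # with an identity kernel it leaves the channels untouched.
    out = np.tensordot(weight[:, :, 0, 0], img, axes=([1], [0]))
    return out + bias[:, None, None]

img = np.zeros((3, 4, 4))
shifted = mean_shift(img, +1)       # adds the mean everywhere
restored = mean_shift(shifted, -1)  # an opposite-sign module undoes it
```

Even though the kernel is fixed, converting such a module still emits one weight tensor and one bias tensor into the state dict, which is why it shows up as an extra conv entry.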

However, I have no idea where the xxx in the second row came from.

If you let me know how you converted our baseline model to a PyTorch model, I will try it and give you an answer.

Thank you.


Also, we have an unofficial PyTorch version of EDSR.

For some reason, it is difficult to release the entire PyTorch code, but I think I can share the model code.

If you want that code, please tell me.

luomomo commented 6 years ago

Here is the converter I used: https://github.com/clcarwin/convert_torch_to_pytorch

And here is my model when nResblock=2:

```
baseline (
  (conv_in): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (residual): NetworkBlock (
    (layer): Sequential (
      (0): BasicBlock (
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (relu): ReLU (inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): BasicBlock (
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (relu): ReLU (inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
  )
  (conv_mid): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (shuffle): PixelShuffle (upscale_factor=2)
  (conv_out): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
```
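One quick sanity check when comparing a converted model against a reimplementation (my own sketch, not code from either repository) is to total the parameters implied by the printout above. Each Conv2d contributes in*out*k*k weights plus out biases:

```python
# Hypothetical check: tally the parameters of the nResblock=2 baseline
# printed above, excluding any mean-shift modules.
convs = [
    (3, 64, 3),    # conv_in
    (64, 64, 3),   # residual block 0, conv1
    (64, 64, 3),   # residual block 0, conv2
    (64, 64, 3),   # residual block 1, conv1
    (64, 64, 3),   # residual block 1, conv2
    (64, 64, 3),   # conv_mid
    (64, 256, 3),  # conv3 (feeds PixelShuffle x2)
    (64, 3, 3),    # conv_out
]

# Weights are c_in * c_out * k * k per conv, plus c_out biases.
total = sum(c_in * c_out * k * k + c_out for c_in, c_out, k in convs)
print(total)  # -> 335875
```

If the two state dicts disagree on this total, the leftover difference pinpoints the extra modules.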

luomomo commented 6 years ago

That is great if you can share the model code.

sanghyun-son commented 6 years ago

I checked the conversion code and found nothing wrong.

I think it would be much better for you to use my code.

Here is a link to the PyTorch version of EDSR:

https://drive.google.com/open?id=0B_riD-4WK4WwMTNEQTFNdjBJUVE

luomomo commented 6 years ago

I finally found the difference! You have two meanShift modules, which lead to two extra conv weights and biases in the model parameters.