limbee / NTIRE2017

Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution"
651 stars 146 forks

Reproducing the model in PyTorch #17

Closed luomomo closed 6 years ago

luomomo commented 6 years ago

Hi, I am trying to reproduce the baseline x2 model in PyTorch. I believe my model is built exactly as described in the paper and in baseline.lua. I converted your model baseline.x2.t7 to baseline.x2.pth and found that its parameters differ from my PyTorch model's. Did you add two more conv() layers in your training, or is there something wrong in my steps?

| my baseline | params size | baseline origin | params size |
| --- | --- | --- | --- |
| conv_in.weight | 64x3x3x3 | 0.weight | 64x3x3x3 |
| xxx | xxx | 1.weight | 64x64x3x3 |
| residual.layer.0.conv1.weight | 64x64x3x3 | 2.0.0.0.0.0.0.weight | 64x64x3x3 |
| residual.layer.0.conv2.weight | 64x64x3x3 | 2.0.0.0.0.0.2.weight | 64x64x3x3 |
| residual.layer.1.conv1.weight | 64x64x3x3 | 2.0.0.1.0.0.0.weight | 64x64x3x3 |
| residual.layer.1.conv2.weight | 64x64x3x3 | 2.0.0.1.0.0.2.weight | 64x64x3x3 |
| …… | …… | …… | …… |
| residual.layer.15.conv2.weight | 64x64x3x3 | 2.0.0.15.0.0.2.weight | 64x64x3x3 |
| conv_mid.weight | 64x64x3x3 | 2.0.0.16.weight | 64x64x3x3 |
| conv3.weight | 256x64x3x3 | 3.0.weight | 256x64x3x3 |
| conv_out.weight | 3x64x3x3 | 4.weight | 3x64x3x3 |
| xxx | xxx | 5.weight | 3x3x1x1 |
sanghyun-son commented 6 years ago

Hello.

xxx in the last row is just a mean-shifting module (it only has biases, and its convolution kernel is an identity kernel).
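As a rough illustration (my own sketch, not the repository's code), a mean-shifting module of this kind can be modeled as a 1x1 convolution whose weight (shape 3x3x1x1, matching the "5.weight" row above) is the identity, so only the bias changes the image. The RGB mean values below are the commonly quoted DIV2K statistics and are an assumption here:

```python
import numpy as np

# Hypothetical sketch of a mean-shifting module: a 1x1 convolution with an
# identity kernel, so the layer only adds its bias (the dataset RGB mean)
# to every pixel, per channel.
RGB_MEAN = np.array([0.4488, 0.4371, 0.4040])  # assumed DIV2K mean, 0-1 scale

def mean_shift(img, sign=+1):
    """img: (3, H, W) array; returns img shifted by sign * RGB_MEAN."""
    weight = np.eye(3).reshape(3, 3, 1, 1)  # identity 1x1 kernel, shape 3x3x1x1
    bias = sign * RGB_MEAN
    # A 1x1 convolution is a per-pixel matrix multiply over channels;
    # with an identity kernel it leaves the channels untouched.
    out = np.tensordot(weight[:, :, 0, 0], img, axes=([1], [0]))
    return out + bias[:, None, None]

img = np.zeros((3, 4, 4))
shifted = mean_shift(img, +1)       # adds the mean everywhere
restored = mean_shift(shifted, -1)  # an opposite-sign module undoes it
```

Even though the kernel is fixed, converting such a module still emits one weight tensor and one bias tensor into the state dict, which is why it shows up as an extra conv entry.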

However, I have no idea where the xxx in the second row came from.

If you let me know how you converted our baseline model to a PyTorch model, I will try it and give you an answer.

Thank you.


Also, we have an unofficial PyTorch version of EDSR.

For some reason, it is difficult to release the entire PyTorch code, but I think I can share the model code.

If you want that code, please tell me.

luomomo commented 6 years ago

Here is the converter I used: https://github.com/clcarwin/convert_torch_to_pytorch

And here is my model when nResblock=2:

```
baseline (
  (conv_in): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (residual): NetworkBlock (
    (layer): Sequential (
      (0): BasicBlock (
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (relu): ReLU (inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): BasicBlock (
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (relu): ReLU (inplace)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
  )
  (conv_mid): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (shuffle): PixelShuffle (upscale_factor=2)
  (conv_out): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
```
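One quick sanity check when comparing a converted model against a reimplementation (my own sketch, not code from either repository) is to total the parameters implied by the printout above. Each Conv2d contributes in*out*k*k weights plus out biases:

```python
# Hypothetical check: tally the parameters of the nResblock=2 baseline
# printed above, excluding any mean-shift modules.
convs = [
    (3, 64, 3),    # conv_in
    (64, 64, 3),   # residual block 0, conv1
    (64, 64, 3),   # residual block 0, conv2
    (64, 64, 3),   # residual block 1, conv1
    (64, 64, 3),   # residual block 1, conv2
    (64, 64, 3),   # conv_mid
    (64, 256, 3),  # conv3 (feeds PixelShuffle x2)
    (64, 3, 3),    # conv_out
]

# Weights are c_in * c_out * k * k per conv, plus c_out biases.
total = sum(c_in * c_out * k * k + c_out for c_in, c_out, k in convs)
print(total)  # -> 335875
```

If the two state dicts disagree on this total, the leftover difference pinpoints the extra modules.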

luomomo commented 6 years ago

That is great if you can share the model code.

sanghyun-son commented 6 years ago

I checked the conversion code and found nothing wrong.

I think it would be much better for you to use my code.

Here is a link to the PyTorch version of EDSR:

https://drive.google.com/open?id=0B_riD-4WK4WwMTNEQTFNdjBJUVE

luomomo commented 6 years ago

I finally found the difference! You have two meanShift modules, which lead to two extra conv weights and biases in the model parameters.