Zj-BinXia / AMSA

This project is the official implementation of "Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-based Super-Resolution" (AAAI 2022).

Why is the output image size of the test different from the input? #10

Open Lkinyuu opened 1 year ago

Lkinyuu commented 1 year ago

Thank you for your contribution. I am confused about why the sizes of the output result and the input image differ; this makes it harder to compute evaluation metrics such as PI. What modifications should I make if I want the output to be the same size? Thanks!

Zj-BinXia commented 1 year ago

Yes, that's because we need to 8x downsample the input images to obtain images at different scales and achieve scale robustness, so the code crops the image size to a multiple of 8. This trick is commonly used in UNet-shaped networks. If you want to keep the input image size, you can instead pad the input image to a multiple of 8 and crop the padding from the output. You can refer to the following code:

```python
import torch.nn.functional as F

# Pad the LR input `lq` on the right/bottom so that H and W become multiples of 8.
mod_pad_h, mod_pad_w = 0, 0
window_size = 8
_, _, h, w = lq.size()
if h % window_size != 0:
    mod_pad_h = window_size - h % window_size
if w % window_size != 0:
    mod_pad_w = window_size - w % window_size
img = F.pad(lq, (0, mod_pad_w, 0, mod_pad_h), 'reflect')
```

Then you can take `img` as the input of the network and obtain the output, and crop the padding away as follows:

```python
# The padded border is scaled up by the upscale factor, so crop mod_pad_* * scale pixels.
_, _, h, w = self.output.size()
scale = 4
output = self.output[:, :, 0:h - mod_pad_h * scale, 0:w - mod_pad_w * scale]
```
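Put together, a minimal self-contained sketch of this pad-forward-crop pattern could look like the following (illustrative only: it assumes a generic single-input `model` callable and a 4x upscale factor, whereas the actual AMSA model also takes a reference image):

```python
import torch.nn.functional as F

def pad_forward_crop(model, lq, window_size=8, scale=4):
    """Pad lq to a multiple of window_size, run the model, then crop the padding."""
    _, _, h, w = lq.size()
    mod_pad_h = (window_size - h % window_size) % window_size
    mod_pad_w = (window_size - w % window_size) % window_size
    img = F.pad(lq, (0, mod_pad_w, 0, mod_pad_h), 'reflect')
    output = model(img)
    _, _, oh, ow = output.size()
    # The padded border is scale times larger in the SR output; crop it away.
    return output[:, :, :oh - mod_pad_h * scale, :ow - mod_pad_w * scale]
```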

Lkinyuu commented 1 year ago

I am afraid that such a step will not work. In the preprocessing, the original input image has already been cropped to a multiple of (scale*8), so padding the input to a multiple of 8 is naturally already covered by that larger multiple.
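For context, the crop step referred to here usually looks like the sketch below (a common `mod_crop` implementation in mmsr/BasicSR-style codebases, shown for illustration rather than copied from this repo). Called with `scale * 8`, it already leaves the image a multiple of 8, so a further pad-to-8 is a no-op:

```python
def mod_crop(img, scale):
    """Crop an HW or HWC numpy image so height and width are multiples of scale (round down)."""
    if img.ndim in (2, 3):
        h, w = img.shape[0], img.shape[1]
        return img[:h - h % scale, :w - w % scale, ...]
    raise ValueError(f'Wrong img ndim: {img.ndim}.')
```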

Lkinyuu commented 1 year ago

I think I solved it, thanks for the answer. I changed the rounding down in `mod_crop` to rounding up.
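"Rounding up" here means padding the image to the next multiple instead of cropping it down; a minimal sketch of such a variant (illustrative only, using reflection padding, and not necessarily the exact change made in the repo):

```python
import numpy as np

def mod_pad(img, scale):
    """Pad an HW or HWC numpy image so height and width are multiples of scale (round up)."""
    h, w = img.shape[0], img.shape[1]
    pad_h = (scale - h % scale) % scale
    pad_w = (scale - w % scale) % scale
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad_width, mode='reflect')
```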

Lkinyuu commented 1 year ago

By the way, when I test with gan.pth everything works fine, but when I test with mse.pth it reports that weights, including the DCN ones, are missing.
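A quick way to see which keys a checkpoint actually contains is a plain PyTorch check like the one below (the path is a placeholder; some checkpoints wrap the weights under a key such as `'params'`):

```python
import torch

ckpt = torch.load('mse.pth', map_location='cpu')  # adjust the path to your download
state_dict = ckpt.get('params', ckpt) if isinstance(ckpt, dict) else ckpt

# List DCN-related parameter names stored in the checkpoint, if any.
dcn_keys = [k for k in state_dict if 'dcn' in k.lower()]
print(f'{len(dcn_keys)} DCN-related keys in checkpoint')
for k in dcn_keys[:10]:
    print(' ', k)
```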

Lkinyuu commented 1 year ago

[screenshot attached] @Zj-BinXia

Zj-BinXia commented 1 year ago

Could you download the checkpoint from Google Drive again? It may be a problem with the download.

Zj-BinXia commented 1 year ago

If you use our original code and checkpoints, you can run AMSA-mse and AMSA-GAN successfully.

Lkinyuu commented 1 year ago

I am quite sure that my download was correct, because after this problem occurred I downloaded it again.

Zj-BinXia commented 1 year ago

I don't know what the problem is, because I can load the checkpoint. Can you test with my original code, using the command `python mmsr/test.py -opt "options/test/test_AMSA_mse.yml"`?

Zj-BinXia commented 1 year ago

And do you use pytorch=1.4, and have you installed DCNv2?

```bash
python setup.py develop
cd mmsr/models/archs/DCNv2
python setup.py build develop
```
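A quick environment sanity check (plain PyTorch, nothing repo-specific) before rebuilding DCNv2:

```python
import torch

print(torch.__version__)          # the repo targets pytorch 1.4
print(torch.cuda.is_available())  # the DCNv2 extension needs a working CUDA setup
```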