jkhu29 / Deblurring-by-Realistic-Blurring

An unofficial PyTorch implementation of "Deblurring by Realistic Blurring".
https://arxiv.org/abs/2004.01860
MIT License

Hello. I'd like to ask you something about a ValueError #4

Closed rusher2020 closed 2 years ago

rusher2020 commented 2 years ago

I trained the model after converting the data to tfrecord format, and ran into a problem while checking the deblurring effect on a single image with "demo.py".

I used the "model.png" image in the existing ./img folder as the input image, and the following ValueError occurred. Can you give me some advice on how to solve it?

Traceback (most recent call last):
  File "./demo.py", line 26, in <module>
    img_yiq[..., 0] = g
ValueError: could not broadcast input array from shape (398,886) into shape (397,885)

The code I used is below.

import cv2
import torch
from skimage.color import rgb2yiq, yiq2rgb

import utils
from model import *

# Load the pretrained BlurGAN generator.
bgan = BlurGAN_G().cuda()
bgan_state_dict = torch.load("./bgan_pretrain.pth")
bgan.load_state_dict(bgan_state_dict)
bgan.eval()

# Read the image (note: cv2.imread returns channels in BGR order).
img = cv2.imread("img/model.png")
img_yiq = rgb2yiq(img)

# HWC -> NCHW float tensor on the GPU.
img = torch.from_numpy(img.transpose(2, 0, 1)).unsqueeze(0).float().to("cuda")
n, c, h, w = img.shape

# Concatenate a 4-channel noise map and run the blur generator.
img_noise = utils.concat_noise(img, (4, h, w), img.size()[0])
img_bgan = bgan(img_noise)[0].detach().cpu().numpy()

# Replace the luminance (Y) channel with the generator output.
g = utils.lum(img_bgan)
img_yiq[..., 0] = g
img_blur = yiq2rgb(img_yiq)

cv2.imwrite("img/model_blur.png", img_blur)
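
(For context, utils.concat_noise appends a noise map to the image along the channel dimension before the tensor is fed to the generator. A rough, hypothetical sketch of the idea, not the repo's exact implementation:)

import torch

def concat_noise_sketch(img, noise_shape, batch_size):
    # Hypothetical: append a random noise map along the channel dimension,
    # so a (n, 3, h, w) image becomes (n, 3 + k, h, w). The repo's
    # utils.concat_noise may differ in details.
    noise = torch.randn((batch_size, *noise_shape), device=img.device)
    return torch.cat([img, noise], dim=1)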
jkhu29 commented 2 years ago

I also noticed this problem a few days ago. The bug comes from the model structure: for now it does not support shapes like (398, 886) that are not divisible by 4. One of the easiest fixes is to resize the image to (400, 888). Or you can try modifying this part of the model to:

nn.Conv2d(out_channels, in_channels, kernel_size=7, padding=3, bias=True)
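
A minimal sketch of the resize workaround (a standalone snippet, assuming any target size is fine as long as both dimensions are divisible by 4):

import cv2

img = cv2.imread("img/model.png")
h, w = img.shape[:2]

# Round each spatial dimension up to the next multiple of 4,
# e.g. (398, 886) -> (400, 888).
new_h = (h + 3) // 4 * 4
new_w = (w + 3) // 4 * 4
img = cv2.resize(img, (new_w, new_h))  # cv2.resize takes (width, height)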
rusher2020 commented 2 years ago

Thanks for your answer! As you mentioned, I was able to solve the problem by adjusting the image size.

Have you ever encountered an occasional runtime error as shown below?

Traceback (most recent call last):
  File "./demo.py", line 31, in <module>
    img_dbgan = dbgan(torch.from_numpy(img_bgan.transpose(2, 0, 1)).float().unsqueeze(0).to("cuda"))[0].detach().cpu().numpy()
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/drive/MyDrive/new/model.py", line 315, in forward
    x = self.conv1(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/padding.py", line 174, in forward
    return F.pad(input, self.padding, 'reflect')
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 4189, in _pad
    return torch._C._nn.reflection_pad2d(input, pad)
RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (3, 3) at dimension 2 of input [1, 840, 3, 360]
jkhu29 commented 2 years ago

You can see that the shape of your input is [1, 840, 3, 360]. In PyTorch, the input shape should be (n, c, h, w), where "n" is the batch size, "c" is the number of channels, and "h" and "w" are the height and width of the image. The 3 should be the channels, so your input's shape is probably (n, h, c, w) or (n, w, c, h), which is not correct. See line 31:

img_dbgan = dbgan(torch.from_numpy(img_bgan.transpose(2, 0, 1)).float().unsqueeze(0).to("cuda"))[0].detach().cpu().numpy()

.transpose(2, 0, 1) may not be necessary in your case, since img_bgan already comes out of bgan(...)[0] in (c, h, w) order.
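For example, dropping the transpose should produce the (n, c, h, w) layout the model expects (a minimal sketch, assuming dbgan is your deblurring generator, loaded the same way as bgan above):

# img_bgan is already (c, h, w); unsqueeze(0) just restores the batch
# dimension, giving (1, c, h, w).
img_dbgan = dbgan(torch.from_numpy(img_bgan).float().unsqueeze(0).to("cuda"))[0].detach().cpu().numpy()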

rusher2020 commented 2 years ago

I understood, thanks a lot! :)