tamarott / SinGAN

Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"
https://tamarott.github.io/SinGAN.htm

single channel error (ValueError : the input array must be have a shape == (.., .., [..,] 3)), got (614, 989, 1)) #131

Open erik5022 opened 3 years ago

erik5022 commented 3 years ago

Sorry to bother you, but I have a problem with single-channel image input. I passed a single-channel image to main_train, and the error in the title appeared. This is the command I used (nc_im = 1): python main_train.py --input_dir Input/test --input_name test.JPG --nc_im 1

I debugged the code and found that the error occurs in the 'np2torch' function in 'imresize.py'. Two 'np2torch' functions (one in functions.py, one in imresize.py) are called at the beginning of main_train, and the input image passes through both.

If I input a 1-channel image (--nc_im 1):

  1. np2torch() at functions.py, input shape : (614, 989) → output shape : (1, 1, 614, 989) --> (No ERROR at color.rgb2gray(x))
  2. torch2uint8() at imresize.py, input shape : (1, 1, 614, 989) → output shape (614, 989, 1)
  3. imresize_in() at imresize.py, input shape : (614, 989, 1) → output shape (156, 250, 1)
  4. np2torch() at imresize.py, input shape : (156, 250, 1) → ERROR at color.rgb2gray(x)

If I input a 3-channel image (with --nc_im 1):

  1. np2torch() at functions.py, input shape : (614, 989, 3) → output shape : (1, 1, 614, 989) --> (No ERROR at color.rgb2gray(x))
  2. torch2uint8() at imresize.py, input shape : (1, 1, 614, 989) → output shape (614, 989, 1)
  3. imresize_in() at imresize.py, input shape : (614, 989, 1) → output shape (156, 250, 1)
  4. np2torch() at imresize.py, input shape : (156, 250, 1) → ERROR at color.rgb2gray(x)

If I input a 3-channel image (without --nc_im 1): NORMAL CASE

  1. np2torch() at functions.py, input shape : (614, 989, 3) → output shape : (1, 3, 614, 989)
  2. torch2uint8() at imresize.py, input shape : (1, 3, 614, 989) → output shape (614, 989, 3)
  3. imresize_in() at imresize.py, input shape : (614, 989, 3) → output shape (156, 250, 3)
  4. np2torch() at imresize.py, input shape : (156, 250, 3) → output shape : (1, 3, 156, 250)

So I modified the 'np2torch()' code so that an input of shape (w, h, 1) produces an output of shape (1, 1, w, h).
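Roughly, the reshape I mean is something like this (a simplified sketch to show the idea only; single_channel_np2torch is just an illustrative name, not the actual function):

```python
import numpy as np
import torch

def single_channel_np2torch(x):
    # Sketch only: map an (h, w) or (h, w, 1) grayscale array
    # to a (1, 1, h, w) float tensor.
    if x.ndim == 2:
        x = x[:, :, None]                   # (h, w)    -> (h, w, 1)
    x = x.transpose((2, 0, 1))[None, ...]   # (h, w, 1) -> (1, 1, h, w)
    return torch.from_numpy(np.ascontiguousarray(x)).float()
```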

But then another error occurred in train_single_scale() in training.py: RuntimeError: Given groups=1, weight of size 32 1 3 3, expected input[1, 3, 36, 51] to have 1 channels, but got 3 channels instead

I suspect my modification to 'np2torch()' is not right. Please let me know how to solve this problem. Thank you for your great work.

tamarott commented 3 years ago

Your modification sounds correct. Please also check if you need to change anything within the imresize function.

zhangkuncsdn commented 3 years ago

Have you modified it? I have the same problem now.

bomtorazek commented 3 years ago


> Have you modified it? I have the same problem now.

I'm not the one whom you asked, but I changed the "np2torch" function in functions.py like below (only the beginning of the function is shown):

```python
def np2torch(x, opt):
    if opt.nc_im == 3:
        if x.ndim == 2:
            # grayscale input: replicate it into 3 identical channels so the
            # rest of the pipeline can keep working with nc_im = 3
            x = np.expand_dims(x, axis=2)
            x = np.concatenate([x, x, x], axis=2)
        x = x[:, :, :, None]
        x = x.transpose((3, 2, 0, 1)) / 255
    # ... (remainder of the function not shown)
```

GuillaumeSa commented 3 years ago

I also modified 'np2torch' (in both functions.py and imresize.py) after running into the same expected-channel-count problem with grayscale images. First of all, here is my np2torch code:

```python
def np2torch(x, opt):
    if opt.nc_im == 3:
        if len(x.shape) == 2:
            x = color.gray2rgb(x)
        x = x[:, :, :, None]
        x = x.transpose((3, 2, 0, 1))
        x = (x - x.min()) / (x.max() - x.min())  # added
    else:
        if x.shape[-1] == 3:
            x = color.rgb2gray(x)
        if len(x.shape) == 2:
            x = x[:, :, None, None]
        else:
            x = x[:, :, :, None]
        x = x.transpose(3, 2, 0, 1)
        x = (x - x.min()) / (x.max() - x.min())  # added
    x = torch.from_numpy(x)
    if not (opt.not_cuda):
        x = move_to_gpu(x)
    x = x.type(torch.cuda.FloatTensor) if not (opt.not_cuda) else x.type(torch.FloatTensor)
    x = norm(x)
    return x
```

Secondly, I suppose opt.nc_im and opt.nc_z have to be equal: 3 for RGB images or 1 for grayscale images. Since the image and the noise are summed during single-scale training, their channel counts have to match; otherwise I don't understand what opt.nc_z is for.
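If both are exposed as command-line options (nc_im is, and I assume nc_z is declared in config.py as well), training on a grayscale image would then look something like: python main_train.py --input_dir Input/test --input_name test.JPG --nc_im 1 --nc_z 1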

Finally, to solve it you should modify the noise size in training.py. For example, line 98:

```python
z_opt = m_noise(z_opt.expand(1,3,opt.nzx,opt.nzy))
```

should be

```python
z_opt = m_noise(z_opt.expand(1,opt.nc_z,opt.nzx,opt.nzy))
```

There are other lines where you should modify that size again, like 100 or 232. You can Ctrl+F for "3" in every file and replace it with opt.nc_z where needed (not everywhere, of course). For instance, you should also modify dilate_mask in functions.py.
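If you prefer the command line to Ctrl+F, running something like grep -rn "expand(1,3" . and grep -rn "nc_z" . from the repository root should list most of the places to check (the exact spelling of the expand calls may vary, so treat this only as a starting point).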