sergeywong / cp-vton

Reimplemented code for "Toward Characteristic-Preserving Image-based Virtual Try-On Network"
MIT License

RuntimeError: Given groups=1, weight of size [64, 22, 4, 4], expected input[4, 24, 256, 192] to have 22 channels, but got 24 channels instead #24

Closed: jakubLangr closed this issue 4 years ago

jakubLangr commented 4 years ago

When running the model on my own data, after making the change from #7 (permuting and un-permuting the size dimensions for im and im_h), I get the following error:

Traceback (most recent call last):
  File "test.py", line 163, in <module>
    main()
  File "test.py", line 151, in main
    test_gmm(opt, train_loader, model, board)
  File "test.py", line 69, in test_gmm
    grid, theta = model(agnostic, c)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/efs/experiments/lucas_gans/cp-vton/networks.py", line 416, in forward
    featureA = self.extractionA(inputA)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/efs/experiments/lucas_gans/cp-vton/networks.py", line 73, in forward
    return self.model(x)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 22, 4, 4], expected input[4, 24, 256, 192] to have 22 channels, but got 24 channels instead

This run came after the model had previously run successfully. The model currently looks as follows:

Sequential(
  (0): Conv2d(22, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
  (1): ReLU(inplace)
  (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
  (4): ReLU(inplace)
  (5): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (6): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
  (7): ReLU(inplace)
  (8): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (9): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
  (10): ReLU(inplace)
  (11): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (12): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (13): ReLU(inplace)
  (14): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (15): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (16): ReLU(inplace)
)

The weight of size [64, 22, 4, 4] appears to be defined by:

self.extractionA = FeatureExtraction(22, ngf=64, n_layers=3, norm_layer=nn.BatchNorm2d)
self.extractionB = FeatureExtraction(3, ngf=64, n_layers=3, norm_layer=nn.BatchNorm2d)

The expected input [4, 24, 256, 192] comes from the data, which for some reason has 2 more channels than the 22 the first convolution expects.
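
For reference, the 22 channels of the person representation in CP-VTON are the concatenation of the body-shape map (1 channel), the reserved head region (3 RGB channels), and the pose heatmap (18 channels). A minimal sanity check, assuming the agnostic tensor is assembled in cp_dataset.py from variables named shape, im_h, and pose_map as in the repo:

import torch

def check_agnostic(shape, im_h, pose_map):
    # Sanity-check the person representation before it reaches the GMM.
    # cp_dataset.py builds it (to my reading) as
    # torch.cat([shape, im_h, pose_map], 0): 1 + 3 + 18 = 22 channels.
    for name, t, c in [('shape', shape, 1), ('im_h', im_h, 3), ('pose_map', pose_map, 18)]:
        if t.shape[0] != c:
            print('%s: expected %d channels, got %d' % (name, c, t.shape[0]))
    return torch.cat([shape, im_h, pose_map], 0)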

solitarysandman commented 4 years ago

Your agnostic tensor has an incorrect shape: it should have 22 channels but seems to have 24. Can you print its shape?
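
A quick way to do that in the test loop, as a sketch (the dictionary keys follow cp_dataset.py, so adjust if yours differ):

# Inside test_gmm in test.py, just before: grid, theta = model(agnostic, c)
agnostic = inputs['agnostic'].cuda()
c = inputs['cloth'].cuda()
print('agnostic:', agnostic.shape)  # expected torch.Size([B, 22, 256, 192])
print('cloth:', c.shape)            # expected torch.Size([B, 3, 256, 192])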

NguyenTriTrinh commented 4 years ago

I encountered the same problem. How did you fix it?

jakubLangr commented 4 years ago

It has been a while, but AFAIK it came down to flattening one of the semantic maps, roughly as sketched below.

Hope it helps!
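
To make that concrete, a minimal sketch of what flattening can look like, assuming the parse map was loaded as an (H, W, 3) array with the label index replicated across the channels (parse_array is the variable name cp_dataset.py uses; how you collapse it depends on how your labels are encoded):

import numpy as np

def flatten_parse(parse_array):
    # If the segmentation was loaded as (H, W, 3) instead of (H, W),
    # the downstream masks pick up spurious channels.
    if parse_array.ndim == 3:
        # assumes the label index is replicated across channels;
        # for one-hot encoded maps use parse_array.argmax(-1) instead
        parse_array = parse_array[:, :, 0]
    return parse_array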

bhavyashahh commented 4 years ago

can you expand on solving this error?

NguyenTriTrinh commented 4 years ago

@bhavyashahh, I solved it by adding a line after https://github.com/sergeywong/cp-vton/blob/fbaf333ba63ddb9c9379e5689241b122f5cf621d/cp_dataset.py#L71:

im_parse = im_parse.convert('P')

Alternatively, print the shape of every component and compare it with the inputs the author describes. Hope it helps you.
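
In context, the fix looks roughly like this (a sketch of the loading code in CPDataset.__getitem__; the surrounding names follow the repo at the linked commit, so double-check against your copy):

from PIL import Image
import numpy as np
import os.path as osp

# cp_dataset.py, inside CPDataset.__getitem__:
parse_name = im_name.replace('.jpg', '.png')
im_parse = Image.open(osp.join(self.data_path, 'image-parse', parse_name))
im_parse = im_parse.convert('P')   # added line: force single-channel
                                   # palette mode so labels stay class indices
parse_array = np.array(im_parse)   # now (H, W), not (H, W, 3) or (H, W, 4)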