Verg-Avesta / CounTR

CounTR: Transformer-based Generalised Visual Counting
https://verg-avesta.github.io/CounTR_Webpage/
MIT License

shape mismatch error #34

Open Chen94yue opened 1 year ago

Chen94yue commented 1 year ago

I ran the demo with my test data.

cy@cy-MS-7D40:~/export/CounTR$ python3 demo.py 
Resume checkpoint ./ckpt/FSC147.pth
/home/cy/.local/lib/python3.10/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
  warnings.warn(
Traceback (most recent call last):
  File "/home/cy/export/CounTR/demo.py", line 212, in <module>
    result, elapsed_time = run_one_image(samples, boxes, pos, model)
  File "/home/cy/export/CounTR/demo.py", line 143, in run_one_image
    output, = model(samples[:, :, :, start:start + 384], boxes, 3)
  File "/home/cy/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cy/export/CounTR/models_mae_cross.py", line 205, in forward
    latent = self.forward_encoder(imgs)
  File "/home/cy/export/CounTR/models_mae_cross.py", line 138, in forward_encoder
    x = self.patch_embed(x)
  File "/home/cy/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cy/.local/lib/python3.10/site-packages/timm/models/layers/patch_embed.py", line 34, in forward
    x = self.proj(x).flatten(2).transpose(1, 2)
  File "/home/cy/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cy/.local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/cy/.local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [768, 3, 16, 16], expected input[1, 4, 384, 384] to have 3 channels, but got 4 channels instead
Chen94yue commented 1 year ago
def load_image():
    # im_dir = '/GPFS/data/changliu/Dataset/FSC147/images_384_VarV2'
    # im_id = '222.jpg'

    # image = Image.open('{}/{}'.format(im_dir, im_id))
    image = Image.open("1690120123_color_副本.png")
    image.load()
    W, H = image.size

    # Resize the image size so that the height is 384
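The traceback above shows that the patch embedding expects a 3-channel input but received 4 channels, which suggests the test PNG is RGBA. A minimal sketch of a workaround, assuming the fourth channel is just alpha and can be dropped before the tensor transform; load_image_rgb is a hypothetical helper, not part of demo.py:

from PIL import Image

def load_image_rgb(path):
    # Force a 3-channel RGB image; an RGBA PNG otherwise produces a
    # 4-channel tensor that the ViT patch embedding
    # (weight of size [768, 3, 16, 16]) rejects.
    image = Image.open(path)
    image.load()
    return image.convert("RGB")

# e.g. image = load_image_rgb("1690120123_color_副本.png")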
Chen94yue commented 1 year ago
    # Coordinates of the exemplar bounding boxes
    # The left upper corner and the right lower corner
    # bboxes = [
    #     [[136, 98], [173, 127]],
    #     [[209, 125], [242, 150]],
    #     [[212, 168], [258, 200]]
    # ]
    bboxes = [[[140, 120], [200, 230]],
              [[250, 100], [360, 160]],
              [[358, 371], [471, 427]]]
    boxes = list()
    rects = list()
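For reference, the exemplar boxes above are given in the original image's pixel coordinates; if load_image resizes the image so that its height becomes 384 (as its comment indicates), the boxes presumably need the same scale factor applied before being passed to the model. A rough sketch under that assumption; scale_boxes is a hypothetical helper, not the demo's actual code:

def scale_boxes(bboxes, orig_h, new_h=384):
    # Rescale each [x, y] corner by the factor used to shrink the image
    # height to 384 (assumes width and height are scaled uniformly).
    scale = new_h / orig_h
    return [[[int(x * scale), int(y * scale)] for x, y in box] for box in bboxes]

# e.g. bboxes = scale_boxes(bboxes, H)  # H taken from image.size in load_image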