zwx8981 / LIQE

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
MIT License
197 stars · 11 forks

Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [15, 468, 3, 224, 224] #29

Closed: JeffreyXy closed this issue 3 weeks ago

JeffreyXy commented 1 month ago

Thanks for your work! When I run demo2.py, I hit this error:

RuntimeError                              Traceback (most recent call last)
Cell In[1], line 35
     33 print('###Preprocessing###')
     34 with torch.no_grad():
---> 35     q1, s1, d1 = model(I1)
     36     q2, s2, d2 = model(I2)
     38 print('Image #1 is a photo of {} with {} artifacts, which has a perceptual quality of {} as quantified by LIQE'.
     39        format(s1, d1, q1.item()))

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File /data/xinyu.gu/LIQE/LIQE.py:81, in LIQE.forward(self, x)
     77 x = x[sel, ...]
     78 #num_patch = x.size(1)
     79 #x = x.view(-1, x.size(2), x.size(3), x.size(4))
---> 81 image_features = self.model.encode_image(x)
     83 # normalized features
     84 image_features = image_features / image_features.norm(dim=1, keepdim=True)

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/clip/model.py:341, in CLIP.encode_image(self, image)
    340 def encode_image(self, image):
--> 341     return self.visual(image.type(self.dtype))

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/clip/model.py:224, in VisionTransformer.forward(self, x)
    223 def forward(self, x: torch.Tensor):
--> 224     x = self.conv1(x)  # shape = [*, width, grid, grid]
    225     x = x.reshape(x.shape[0], x.shape[1], -1)  # shape = [*, width, grid ** 2]
    226     x = x.permute(0, 2, 1)  # shape = [*, grid ** 2, width]

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/torch/nn/modules/conv.py:457, in Conv2d.forward(self, input)
    456 def forward(self, input: Tensor) -> Tensor:
--> 457     return self._conv_forward(input, self.weight, self.bias)

File /data/anaconda3/envs/jeffrey_clip/lib/python3.10/site-packages/torch/nn/modules/conv.py:453, in Conv2d._conv_forward(self, input, weight, bias)
    449 if self.padding_mode != 'zeros':
    450     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    451                     weight, bias, self.stride,
    452                     _pair(0), self.dilation, self.groups)
--> 453 return F.conv2d(input, weight, bias, self.stride,
    454                 self.padding, self.dilation, self.groups)

RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [15, 468, 3, 224, 224]
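
A note on what is going on here: judging by the shapes in the traceback, LIQE feeds CLIP a stack of crops per image, i.e. a 5D tensor of shape [batch, num_patch, C, H, W], while torch.nn.Conv2d (CLIP's patch-embedding layer) only accepts 3D or 4D input. The usual remedy, which the commented-out lines 78-79 in LIQE.py above hint at, is to fold the crop dimension into the batch dimension before encode_image and unfold it afterwards. A minimal sketch of that reshape round trip (the variable names and small shapes are illustrative, not the repository's exact code):

```python
import torch

# Illustrative shapes: 2 images with 5 crops each
# (the failing run above had 15 images x 468 crops).
x = torch.randn(2, 5, 3, 224, 224)        # [batch, num_patch, C, H, W] -> 5D

batch_size, num_patch = x.shape[0], x.shape[1]

# Conv2d only accepts 3D/4D input, so fold crops into the batch dimension:
x_flat = x.reshape(batch_size * num_patch, *x.shape[2:])  # [10, 3, 224, 224]

# ... x_flat is what CLIP's encode_image / conv1 can actually consume ...

# Unfold afterwards so per-crop features can be pooled per image:
feats = x_flat.reshape(batch_size, num_patch, *x_flat.shape[1:])
```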

zwx8981 commented 1 month ago

Thanks for the report. I have made some changes; you may try again.

JeffreyXy commented 1 month ago

Thanks a lot, it works! By the way, it seems "self." is missing before num_patch:

x = x.reshape(batch_size * num_patch, x.shape[2], x.shape[3], x.shape[4])
should be:
x = x.reshape(batch_size * self.num_patch, x.shape[2], x.shape[3], x.shape[4])
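
For anyone hitting the resulting NameError: num_patch is recorded in the constructor, so forward() can only reach it through self. A toy sketch of that pattern (a simplified stand-in, not LIQE's actual class):

```python
import torch
import torch.nn as nn

class PatchReshaper(nn.Module):
    """Toy stand-in for LIQE's crop handling (illustrative only)."""

    def __init__(self, num_patch: int = 15):
        super().__init__()
        self.num_patch = num_patch  # stored on self in __init__ ...

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch_size = x.size(0)
        # ... so it must be read as self.num_patch here; a bare
        # num_patch would raise NameError, which is the bug noted above.
        return x.reshape(batch_size * self.num_patch,
                         x.shape[2], x.shape[3], x.shape[4])

# Usage: fold [2, 15, 3, 224, 224] crops into [30, 3, 224, 224].
out = PatchReshaper(num_patch=15)(torch.randn(2, 15, 3, 224, 224))
```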

beibei987 commented 3 weeks ago

Hello, I am also experiencing the same problem. Can you tell us more about the solution?

zwx8981 commented 3 weeks ago

@beibei987 Hi, I think this should have been solved in the latest version.

beibei987 commented 3 weeks ago

@zwx8981 OK, thank you very much.