InterDigitalInc / CompressAI

A PyTorch library and evaluation platform for end-to-end compression research
https://interdigitalinc.github.io/CompressAI/
BSD 3-Clause Clear License

got RuntimeError #201

Closed sddai closed 1 year ago

sddai commented 1 year ago

I gave the mbt2018 pre-trained model a JPEG file and expected the corresponding output. However, it fails with "Sizes of tensors must match except in dimension 1."

How can I fix it?

import torch
from PIL import Image
from torchvision import transforms
from compressai.zoo import mbt2018

device = "cuda" if torch.cuda.is_available() else "cpu"
net = mbt2018(quality=2, pretrained=True).eval().to(device)
img = Image.open('../samples/3QHCA5545100F0F_1666577425821.jpg').convert('RGB')
x = transforms.ToTensor()(img).unsqueeze(0).to(device)
with torch.no_grad():
    out_net = net.forward(x)
out_net['x_hat'].clamp_(0, 1)
print(out_net.keys())

here's the output error:

RuntimeError                              Traceback (most recent call last)
Cell In[15], line 2
      1 with torch.no_grad():
----> 2     out_net = net.forward(x)
      3 out_net['x_hat'].clamp_(0, 1)
      4 print(out_net.keys())

File ~/miniconda3/lib/python3.8/site-packages/compressai/models/google.py:545, in JointAutoregressiveHierarchicalPriors.forward(self, x)
    540 y_hat = self.gaussian_conditional.quantize(
    541     y, "noise" if self.training else "dequantize"
    542 )
    543 ctx_params = self.context_prediction(y_hat)
    544 gaussian_params = self.entropy_parameters(
--> 545     torch.cat((params, ctx_params), dim=1)
    546 )
    547 scales_hat, means_hat = gaussian_params.chunk(2, 1)
    548 _, y_likelihoods = self.gaussian_conditional(y, scales_hat, means=means_hat)

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 92 but got size 90 for tensor number 1 in the list.

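For context on where 92 vs. 90 comes from: the analysis transform downsamples the image 16x to the latent `y`, the hyper-analysis downsamples a further 4x to `z`, and the hyper-synthesis then upsamples `z` back by exactly 4x to produce `params`. Stride-2 layers with "same"-style padding round up to `ceil(n/2)`, so if a spatial size of `y` is not a multiple of 4, `params` comes out larger than `ctx_params` and the `torch.cat` fails. A rough sketch of the arithmetic (the image height 1440 is a hypothetical value chosen to reproduce the numbers in the traceback):

```python
import math

def down(n, times):
    # Each stride-2 downsampling layer maps n -> ceil(n / 2)
    # (e.g. kernel 5 / padding 2, or kernel 3 / padding 1).
    for _ in range(times):
        n = math.ceil(n / 2)
    return n

h_img = 1440              # hypothetical image height, not a multiple of 64
h_y = down(h_img, 4)      # latent y after 16x downsampling: 90
h_z = down(h_y, 2)        # hyper-latent z after a further 4x: 23
h_params = h_z * 4        # hyper-synthesis upsamples by exactly 4x: 92
# torch.cat((params, ctx_params), dim=1) then sees 92 vs. 90 -> RuntimeError
```

With a height that is a multiple of 64, every division is exact and the two tensors line up.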

YodaEmbedding commented 1 year ago

Works for me.

>>> from compressai.zoo import mbt2018
... from PIL import Image
... from torchvision import transforms
... 
... device = "cuda"
... net = mbt2018(quality=2, pretrained=True).eval().to(device)
... img = Image.open("datasets/kodak/test/kodim01.png").convert("RGB")
... x = transforms.ToTensor()(img).unsqueeze(0).to(device)
... 
... with torch.no_grad():
...     out_net = net(x)  # usually the same as net.forward(x)
... x_hat = out_net['x_hat'].clamp_(0, 1)
... psnr = -10 * ((x_hat - x)**2).mean().log10()

>>> x.shape
torch.Size([1, 3, 512, 768])

>>> x.dtype
torch.float32

>>> x.device
device(type='cuda', index=0)

>>> psnr
tensor(27.0906, device='cuda:0')
fracape commented 1 year ago

Hi,

You may be using an image resolution that is not a multiple of 64. Note that it *is* a multiple of 64 in @YodaEmbedding's example (512 × 768).

Please look at the padding performed in codec.py or in the evaluation tool in utils.
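The padding fracape mentions can be sketched like this (a minimal illustration, not the library's exact code; `pad_to_multiple` and `crop_back` are hypothetical helper names, and CompressAI's evaluation utilities implement the same idea):

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(x, multiple=64):
    # Pad H and W up to the next multiple of 64, since the hierarchical
    # models downsample by a total factor of 64. Returns the padded tensor
    # and the padding so the output can be cropped back afterwards.
    h, w = x.shape[2], x.shape[3]
    new_h = (h + multiple - 1) // multiple * multiple
    new_w = (w + multiple - 1) // multiple * multiple
    pad_left = (new_w - w) // 2
    pad_right = new_w - w - pad_left
    pad_top = (new_h - h) // 2
    pad_bottom = new_h - h - pad_top
    padding = (pad_left, pad_right, pad_top, pad_bottom)
    return F.pad(x, padding, mode="constant", value=0), padding

def crop_back(x_hat, padding):
    # Undo pad_to_multiple on the reconstruction.
    pad_left, pad_right, pad_top, pad_bottom = padding
    return x_hat[..., pad_top : x_hat.shape[2] - pad_bottom,
                 pad_left : x_hat.shape[3] - pad_right]
```

Usage: pad `x` before `net(x)`, then crop `out_net['x_hat']` with the recorded padding so the reconstruction matches the original resolution.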

sddai commented 1 year ago

@fracape @YodaEmbedding Thank you very much.