InterDigitalInc / CompressAI

A PyTorch library and evaluation platform for end-to-end compression research
https://interdigitalinc.github.io/CompressAI/
BSD 3-Clause Clear License

Differences of reconstruction results between forward() and compress() #41

Closed JPEG2021 closed 3 years ago

JPEG2021 commented 3 years ago

I found that the reconstructed images differ between forward() inference and actual compression when using the pretrained mbt2018 model provided by the library. For example, when I ran Kodak image 19 through the pretrained mbt2018 model at quality 1 with forward(), I got the following results (rounded):

Bit-rate: 0.0905 bpp
PSNR: 28.0561dB
MS-SSIM: 0.9084

However, when I actually compressed it, the reconstructed image differed from the above. Please note the PSNR and MS-SSIM values.
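For reference, the PSNR figure being compared here follows the standard definition, 10·log10(MAX²/MSE). A minimal pure-Python sketch (this is illustrative only, not CompressAI code; the function name is mine):

```python
import math

def psnr(x, x_hat, max_val=255.0):
    """Peak signal-to-noise ratio between two equally sized images,
    given here as flat lists of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return 10.0 * math.log10(max_val ** 2 / mse)

# Tiny example: a 4-pixel "image" and a reconstruction off by 1 everywhere.
orig = [100.0, 110.0, 120.0, 130.0]
recon = [101.0, 111.0, 121.0, 131.0]
print(round(psnr(orig, recon), 2))  # MSE = 1 -> 10*log10(255^2) = 48.13 dB
```

Even small differences in the reconstruction (as between forward() and compress() here) shift this value, which is why the two PSNR numbers do not match exactly.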

$ python3 -m compressai.utils.eval_model pretrained -a mbt2018 -q 1 -m mse ./kodak_19/

{
  "name": "mbt2018",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      28.058247955168817
    ],
    "ms-ssim": [
      0.9069046378135681
    ],
    "bpp": [
      0.091796875
    ],
    "encoding_time": [
      5.185348987579346
    ],
    "decoding_time": [
      10.088664054870605
    ]
  }
}

Other models such as mbt2018_mean were fine. Is this a bug? Thanks.

jbegaint commented 3 years ago

Yes, differences are expected, and the PSNR and bpp are still pretty close. When compressing/decompressing images, the model is run in evaluation mode: the latent values are quantized (hard rounding) and the entropy coder uses quantized CDF tables.

When running the model with forward(), the rate is instead estimated from the learned distributions (entropy bottleneck) and the Gaussian parameters (mean/scale).
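The rate estimate mentioned here is the total information content of the latents under the learned probability model, normalized by the number of image pixels. A toy pure-Python sketch of that formula (illustrative only; the function name and inputs are mine, not the library's):

```python
import math

def estimated_bpp(likelihoods, num_pixels):
    """Estimated rate in bits per pixel: sum of -log2 of the modeled
    probability of each latent element, divided by the pixel count."""
    total_bits = -sum(math.log2(p) for p in likelihoods)
    return total_bits / num_pixels

# Four latent elements, each with modeled probability 0.5, cost 1 bit
# apiece; over a 16-pixel image that is 0.25 bpp.
print(estimated_bpp([0.5, 0.5, 0.5, 0.5], 16))  # 0.25
```

The actual entropy coder works from quantized CDF tables rather than these continuous likelihoods, which is one reason the estimated bpp (0.0905 above) and the real coded bpp (0.0918) differ slightly.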

It's likely the difference is a bit larger for this model compared to the other ones, due to the auto-regressive nature of the decoding.

JPEG2021 commented 3 years ago

Thanks for the answer. However, I ran the mbt2018 model with forward() in eval mode, which should use the same quantization method (rounding) as real compression, so the same quantized latent representations should be computed. In that case I would expect the same PSNR and MS-SSIM, with only the bpp differing. Is the different reconstruction between forward() and compress() a characteristic of the auto-regressive decoding? If so, do the quantized CDF tables used by the entropy coder accumulate more error in the auto-regressive case than in the others? I want to understand the exact reason. Thanks!

jbegaint commented 3 years ago

As often, the answers are in the code ;-) See https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py#L463 and https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py#L538. The slight performance difference lies in the way the y elements are quantized: in compress() we use the predicted means as offsets before and after rounding, while the forward pass just rounds.
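In scalar form, the two quantizers described above behave like this (a toy sketch under my own function names, not the library code):

```python
def quantize_forward(y):
    # forward()/eval path as described above: plain rounding of the latent.
    return round(y)

def quantize_compress(y, mean):
    # compress() path: round the residual around the predicted mean,
    # then add the mean back after rounding.
    return round(y - mean) + mean

y, mean = 2.6, 2.4
print(quantize_forward(y))         # round(2.6) = 3
print(quantize_compress(y, mean))  # round(0.2) + 2.4 = 2.4
```

Whenever the predicted mean is not itself an integer, the two paths can land on different reconstructed latent values (3 vs 2.4 here), which propagates into slightly different reconstructions and hence different PSNR/MS-SSIM, even though both paths use hard rounding.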

Closing this now, as this is not a bug and we'd like to use issues for bugs only. Please open a discussion thread if you have more questions.

JPEG2021 commented 3 years ago

I see. Thank you for the comments.