tengqi200 opened 1 year ago
Update: Good news! After increasing the number of training iterations, the hyperprior version outperformed the factorized version. However, after rounding, the hyper-latent z becomes all zeros, and I don't know how to solve this.
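A quick way to check whether the hyper-latent really collapses is to measure how much of z survives rounding and how wide its dynamic range is. This is a minimal diagnostic sketch, assuming `z` is the hyper-encoder output as a PyTorch tensor:

```python
import torch

def inspect_hyper_latent(z: torch.Tensor) -> None:
    # If z collapses to all zeros after rounding, its dynamic range
    # has likely shrunk inside the quantization bin (-0.5, 0.5).
    z_hat = torch.round(z)
    print(f"nonzero fraction after rounding: {(z_hat != 0).float().mean().item():.4f}")
    print(f"z range: [{z.min().item():.4f}, {z.max().item():.4f}]")
    print(f"z std: {z.std().item():.4f}")
```

If the whole range of z sits inside (-0.5, 0.5), every element rounds to zero and the hyperprior transmits no side information, so the conditional model degenerates to a fixed prior.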
Hi.
I extended this code with the Laplace conditional entropy model used in the PyTorch version of PCGCv1. My hyper-encoder and hyper-decoder are similar to the main encoder and decoder, and the loss function follows the paper (e.g., the multiscale BCE loss). In my experiments, y and z have 8 and 4 channels, respectively.
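For context, a Laplace conditional entropy model typically scores each quantized value by the probability mass of its quantization bin under a Laplace distribution whose parameters are predicted by the hyper-decoder. A minimal sketch of that computation (the function name and defaults are my own, not from PCGCv1):

```python
import torch

def laplace_bin_likelihood(y_hat, loc, scale, scale_bound=1e-9):
    # Keep the predicted scale above a small bound so the
    # distribution stays well defined (the bound is discussed below).
    scale = torch.clamp(scale, min=scale_bound)
    dist = torch.distributions.Laplace(loc, scale)
    # Probability mass of the quantization bin [y_hat - 0.5, y_hat + 0.5];
    # the rate term is then -log2(likelihood) summed over elements.
    return dist.cdf(y_hat + 0.5) - dist.cdf(y_hat - 0.5)
```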
The paper for this code reports that the hyperprior version achieves a -4.14% BD-rate gain over the factorized-prior version. However, my hyperprior version can hardly outperform the factorized-prior version.
If I raise the scale bound from 1e-9 to 0.11, my hyperprior version improves, but it still does not outperform the factorized-prior version.
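One thing worth checking is how that bound is applied. A hard clamp zeroes the gradient for any scale pinned at the bound, which can trap the hyperprior in a degenerate state; a straight-through lower bound (in the spirit of the LowerBound op in tensorflow-compression and CompressAI; this sketch is an assumption about the setup, not code from this repo) lets the gradient push the scale back above the bound:

```python
import torch

class LowerBoundFn(torch.autograd.Function):
    """Clamp x to a minimum, but let gradients through whenever
    they would move x back above the bound."""

    @staticmethod
    def forward(ctx, x, bound):
        ctx.save_for_backward(x, bound)
        return torch.max(x, bound)

    @staticmethod
    def backward(ctx, grad_output):
        x, bound = ctx.saved_tensors
        # Pass the gradient if x is above the bound, or if the update
        # would increase x (grad < 0 under gradient descent).
        pass_through = (x >= bound) | (grad_output < 0)
        return pass_through.type_as(grad_output) * grad_output, None

def lower_bound(x, bound=0.11):
    b = torch.tensor(bound, dtype=x.dtype, device=x.device)
    return LowerBoundFn.apply(x, b)
```

As a sanity check on the 0.11 value: a zero-mean Laplace with scale 0.11 puts about 0.989 of its mass in the central bin [-0.5, 0.5], i.e., roughly 0.015 bits per element, so near-deterministic latents stay almost free while the scale can no longer underflow.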
This confuses me, and I would like to know where the problem lies.
Thank you.