tengqi200 opened 1 year ago
Update: Good news! After increasing the number of training iterations, the hyperprior version outperformed the factorized version. However, after rounding, the hyper-latent z becomes all zeros, and I don't know how to solve this.
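A quick way to check whether the hyper-latent really collapses is to measure how much of z survives rounding and how wide its dynamic range is. This is a minimal diagnostic sketch, assuming `z` is the hyper-encoder output as a PyTorch tensor:

```python
import torch

def inspect_hyper_latent(z: torch.Tensor) -> None:
    # If z collapses to all zeros after rounding, its dynamic range
    # has likely shrunk inside the quantization bin (-0.5, 0.5).
    z_hat = torch.round(z)
    print(f"nonzero fraction after rounding: {(z_hat != 0).float().mean().item():.4f}")
    print(f"z range: [{z.min().item():.4f}, {z.max().item():.4f}]")
    print(f"z std: {z.std().item():.4f}")
```

If the whole range of z sits inside (-0.5, 0.5), every element rounds to zero and the hyperprior transmits no side information, so the conditional model degenerates to a fixed prior.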
Hi.
I extended this code with the Laplace conditional entropy model used in the PyTorch version of PCGCv1. My hyper-encoder and hyper-decoder are similar to the main encoder and decoder, and the loss function follows the paper (e.g., the multiscale BCE loss). In my experiments, y and z have 8 and 4 channels, respectively.
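For context, a Laplace conditional entropy model typically scores each quantized value by the probability mass of its quantization bin under a Laplace distribution whose parameters are predicted by the hyper-decoder. A minimal sketch of that computation (the function name and defaults are my own, not from PCGCv1):

```python
import torch

def laplace_bin_likelihood(y_hat, loc, scale, scale_bound=1e-9):
    # Keep the predicted scale above a small bound so the
    # distribution stays well defined (the bound is discussed below).
    scale = torch.clamp(scale, min=scale_bound)
    dist = torch.distributions.Laplace(loc, scale)
    # Probability mass of the quantization bin [y_hat - 0.5, y_hat + 0.5];
    # the rate term is then -log2(likelihood) summed over elements.
    return dist.cdf(y_hat + 0.5) - dist.cdf(y_hat - 0.5)
```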
The paper for this code reports that the hyperprior version achieves a -4.14% BD-rate gain over the factorized-prior version. However, my hyperprior version can hardly outperform the factorized-prior version.
If I raise the scale bound from 1e-9 to 0.11, my hyperprior version improves, but it still does not outperform the factorized-prior version.
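One thing worth checking is how that bound is applied. A hard clamp zeroes the gradient for any scale pinned at the bound, which can trap the hyperprior in a degenerate state; a straight-through lower bound (in the spirit of the LowerBound op in tensorflow-compression and CompressAI; this sketch is an assumption about the setup, not code from this repo) lets the gradient push the scale back above the bound:

```python
import torch

class LowerBoundFn(torch.autograd.Function):
    """Clamp x to a minimum, but let gradients through whenever
    they would move x back above the bound."""

    @staticmethod
    def forward(ctx, x, bound):
        ctx.save_for_backward(x, bound)
        return torch.max(x, bound)

    @staticmethod
    def backward(ctx, grad_output):
        x, bound = ctx.saved_tensors
        # Pass the gradient if x is above the bound, or if the update
        # would increase x (grad < 0 under gradient descent).
        pass_through = (x >= bound) | (grad_output < 0)
        return pass_through.type_as(grad_output) * grad_output, None

def lower_bound(x, bound=0.11):
    b = torch.tensor(bound, dtype=x.dtype, device=x.device)
    return LowerBoundFn.apply(x, b)
```

As a sanity check on the 0.11 value: a zero-mean Laplace with scale 0.11 puts about 0.989 of its mass in the central bin [-0.5, 0.5], i.e., roughly 0.015 bits per element, so near-deterministic latents stay almost free while the scale can no longer underflow.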
This confuses me, and I would like to know where the problem lies.
Thank you.