@jbegaint Sorry for the late reply with more detailed information. I found that the problem is in the forward function. If I feed the quantized latent y to the hyperprior net, I get different bpp between the forward() estimation and the compress() actual entropy coding. Here is the code to reproduce the issue I described.
def forward(self, x):
    y = self.g_a(x)
    y = self.gaussian_conditional.quantize(y, "noise")  # quantized latent y
    z = self.h_a(y)
    z_hat, z_likelihoods = self.entropy_bottleneck(z)
    gaussian_params = self.h_s(z_hat)
    scales_hat, means_hat = gaussian_params.chunk(2, 1)
    y_hat, y_likelihoods = self.gaussian_conditional(y, scales_hat, means=means_hat)
    x_hat = self.g_s(y_hat)
    return {
        "y": y,
        "y_hat": y_hat,
        "x_hat": x_hat,
        "likelihoods": {"y": y_likelihoods, "z": z_likelihoods},
    }
def compress(self, x):
    y = self.g_a(x)
    y = self.gaussian_conditional.quantize(y, "symbols")  # quantized latent y
    z = self.h_a(y)
    z_strings = self.entropy_bottleneck.compress(z)
    z_hat = self.entropy_bottleneck.decompress(z_strings, z.size()[-2:])
    gaussian_params = self.h_s(z_hat)
    scales_hat, means_hat = gaussian_params.chunk(2, 1)
    indexes = self.gaussian_conditional.build_indexes(scales_hat)
    y_strings = self.gaussian_conditional.compress(y, indexes, means=means_hat)  # y_strings is a list containing a single element
    return {"strings": [y_strings, z_strings], "shape": z.size()[-2:]}
Indeed, Ballé doesn't do this in the paper (Variational image compression with a scale hyperprior), but some papers do feed the quantized latent to the hyperprior net, e.g. Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation (CVPR 2021) and SpatioTemporal Entropy Model is All You Need for LVC. In my understanding, this detail should not affect the final result much. I just wonder why it affects the bpp estimation?
Hi, you probably have a bug somewhere in your entropy modelling, possibly a mix-up between the quantized/noisy/original tensors. If you can pinpoint an issue with the GaussianConditional implementation, I can take a look.
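To illustrate what I mean by a mix-up (a rough sketch with made-up tensors, just for illustration): quantize() behaves differently depending on the mode and on whether means are passed, so quantizing y by hand and then again inside compress() may not produce the symbols the likelihood model assumes.

import torch
from compressai.entropy_models import GaussianConditional

gc = GaussianConditional(None)

y = torch.randn(1, 8, 4, 4)
means = torch.randn(1, 8, 4, 4)

# "noise": adds uniform noise in [-0.5, 0.5) (training-time relaxation)
y_noisy = gc.quantize(y, "noise")

# "dequantize": round(y - means) + means (what the eval-time forward pass uses)
y_dequant = gc.quantize(y, "dequantize", means)

# "symbols": round(y - means) as integer symbols (what compress() encodes)
y_symbols = gc.quantize(y, "symbols", means)

# Rounding y without means first, then rounding again with means (as a second
# quantization step would do), is not the same as rounding (y - means) once:
mismatch = (torch.round(torch.round(y) - means) != torch.round(y - means)).float().mean()
print(f"fraction of mismatching symbols: {mismatch:.3f}")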
Thanks for the answer! I'll check my code. :)
I created a new entropy model inheriting from CompressionModel, similar to MeanScaleHyperprior, which contains an EntropyBottleneck for modeling the side information z and a GaussianConditional for modeling the image latent feature y. I also trained a MeanScaleHyperprior model at quality 4 and only use it to generate the image latent feature y for my new entropy model.
When I evaluate the MeanScaleHyperprior model, I get almost the same bpp using model.forward() or model.compress(), but I get different results when evaluating my new entropy model.
Here is the code.
I used some frames from the HEVC Class B standard test sequences for testing and found that estimate_z_bpp is quite close to actual_z_bpp, while estimate_y_bpp is about half of actual_y_bpp. So there seems to be no problem with the EntropyBottleneck, but something is wrong with the GaussianConditional. I have no idea what it is.
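In case it helps to localize the problem, this is the consistency check I am running (a sketch; check_y_hat_consistency is my own helper, and it reuses the submodule names from the code above): if the y_hat decoded from the bitstream does not match the y_hat used to compute y_likelihoods, the two code paths are not quantizing the same tensor.

import torch

@torch.no_grad()
def check_y_hat_consistency(model, x):
    model.eval()
    out_net = model(x)           # y_hat used for the likelihood estimate
    out_enc = model.compress(x)  # strings produced by actual entropy coding

    # Re-decode y from the bitstream the same way a decoder would
    y_strings, z_strings = out_enc["strings"]
    z_hat = model.entropy_bottleneck.decompress(z_strings, out_enc["shape"])
    scales_hat, means_hat = model.h_s(z_hat).chunk(2, 1)
    indexes = model.gaussian_conditional.build_indexes(scales_hat)
    y_hat_dec = model.gaussian_conditional.decompress(y_strings, indexes, means=means_hat)

    max_err = (out_net["y_hat"] - y_hat_dec).abs().max().item()
    print(f"max |y_hat(forward) - y_hat(decoded)| = {max_err:.4f}")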
Looking forward to your help! Thanks!