achel-x opened 2 weeks ago
The Discretized Gaussian Mixture Likelihoods follow the equation in the paper:

$$p_{\hat{y}|\hat{z}}(\hat{y}\,|\,\hat{z}) = \prod_i \left( \sum_{k=1}^{K} \omega_i^{(k)} \, \mathcal{N}\!\left(\mu_i^{(k)}, \sigma_i^{2(k)}\right) * \mathcal{U}\!\left(-\tfrac{1}{2}, \tfrac{1}{2}\right) \right)\!(\hat{y}_i)$$

In this equation, $\omega$ refers to the weights in the code:
https://github.com/InterDigitalInc/CompressAI/blob/743680befc146a6d8ee7840285584f2ce00c3732/compressai/entropy_models/entropy_models.py#L735-L751
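To make the equation concrete, here is a minimal standalone sketch of a discretized Gaussian mixture likelihood (not CompressAI's exact implementation; the function name, tensor layout with a leading mixture dimension `K`, and broadcasting are my own assumptions for illustration):

```python
import torch


def dgmm_likelihood(y_hat, weights, means, scales):
    """Discretized Gaussian mixture likelihood of integer-quantized y_hat.

    Sketch only: weights/means/scales carry a leading mixture dimension K
    and broadcast against y_hat. weights must sum to 1 over dim 0.
    """
    def std_normal_cdf(x):
        # Standard-normal CDF via the error function
        return 0.5 * (1 + torch.erf(x / 2 ** 0.5))

    # Integrate each Gaussian over the unit-width bin [y - 1/2, y + 1/2]
    upper = std_normal_cdf((y_hat + 0.5 - means) / scales)
    lower = std_normal_cdf((y_hat - 0.5 - means) / scales)
    # Weighted sum over the K mixture components
    return (weights * (upper - lower)).sum(dim=0)


# Toy check: one symbol, K = 2 mixture components
y = torch.tensor([0.0])
w = torch.tensor([[0.6], [0.4]])       # mixture weights, sum to 1 over K
mu = torch.tensor([[0.0], [1.0]])
sigma = torch.tensor([[1.0], [1.0]])
p = dgmm_likelihood(y, w, mu, sigma)
print(p)  # a probability in (0, 1)
```

With a single sharp component centered on the symbol, the likelihood approaches 1, which matches the intuition that nearly all probability mass falls inside the bin.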
Usually, the parameters of the latent codec distribution, including the weights, are the outputs of some neural networks.
You can slightly modify the network's output to obtain the weights.
Thanks for your kind instruction.
I tried a modification following the suggestion in this issue:
https://github.com/InterDigitalInc/CompressAI/issues/289#issuecomment-2485716961
I made the same modification, using GaussianMixtureConditional as shown below:
```python
class Cheng2020GMM(Cheng2020Anchor):
    def __init__(self, N=192, **kwargs):
        super().__init__(N=N, **kwargs)
        self.K = 3  # number of Gaussian mixture components
        self.entropy_parameters = nn.Sequential(
            nn.Conv2d(N * 12 // 3, N * 10 // 3, 1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(N * 10 // 3, N * 8 // 3, 1),
            nn.LeakyReLU(inplace=True),
            # nn.Conv2d(N * 8 // 3, N * 6 // 3, 1),
            nn.Conv2d(N * 8 // 3, N * 3 * self.K, 1),
        )
        self.gaussian_conditional = GaussianMixtureConditional(K=self.K)

    def forward(self, x):
        y = self.g_a(x)
        z = self.h_a(y)
        z_hat, z_likelihoods = self.entropy_bottleneck(z)
        params = self.h_s(z_hat)
        y_hat = self.gaussian_conditional.quantize(
            y, "noise" if self.training else "dequantize"
        )
        ctx_params = self.context_prediction(y_hat)
        gaussian_params = self.entropy_parameters(
            torch.cat((params, ctx_params), dim=1)
        )
        # print(f"gaussian_params.shape is {gaussian_params.shape}")  # [8, 1728, 16, 16]
        # scales_hat, means_hat = gaussian_params.chunk(2, 1)
        scales_hat, means_hat, weight_hat = gaussian_params.chunk(3, 1)
        B, C, H, W = weight_hat.shape  # C is M * K = M * 3
        weight_hat = nn.functional.softmax(
            weight_hat.reshape(B, self.K, C // self.K, H, W), dim=1
        ).reshape(B, C, H, W)
        # _, y_likelihoods = self.gaussian_conditional(y, scales_hat, means=means_hat)
        y_hat1, y_likelihoods = self.gaussian_conditional(
            y, scales_hat, means_hat, weights=weight_hat
        )
        # x_hat = self.g_s(y_hat)
        x_hat = self.g_s(y_hat1)
        return {
            "x_hat": x_hat,
            "likelihoods": {"y": y_likelihoods, "z": z_likelihoods},
        }
```
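As a standalone sanity check of the reshape-plus-softmax step used for the weights, here is a small sketch (my own toy shapes; whether this (K, M) channel ordering matches the layout GaussianMixtureConditional expects should be verified against the linked source, since a mismatched ordering would silently degrade results):

```python
import torch
import torch.nn.functional as F

# Toy dimensions: batch B, latent channels M, mixture components K
B, M, K, H, W = 2, 4, 3, 8, 8
weight_hat = torch.randn(B, M * K, H, W)  # raw network output for the weights

# Split the M*K channels into K groups and softmax across the K axis,
# so each latent channel gets K mixture weights that sum to 1
w = F.softmax(weight_hat.reshape(B, K, M, H, W), dim=1)
print(w.sum(dim=1).allclose(torch.ones(B, M, H, W)))  # weights sum to 1

# Flatten back to the (B, M*K, H, W) layout expected downstream
weight_hat = w.reshape(B, M * K, H, W)
```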
I compared Cheng2020GMM and Cheng2020Anchor.
The results are confusing:

[GMM results image]

[Anchor results image]
The GMM is inferior to the anchor, and I am unable to understand why. If you have any insights here, please help me out at your convenience!
Thanks again for your valuable time.
Best wishes.
Hi! I have the same problem as well. Have you found a good solution yet?
Perhaps try using STE for quantization instead of noise.
Still, it's weird that GMM K=3 performs that much worse than GC. Try setting K=1 and training. Is the performance still worse?
When I ran the Cheng2020 series, I noticed that Cheng2020Anchor inherits from JointAutoregressiveHierarchicalPriors. The `gaussian_conditional` model in `JointAutoregressiveHierarchicalPriors` is just GaussianConditional; it is not a Gaussian mixture model.
I tried to add one line in Cheng2020Anchor:

```python
self.gaussian_conditional = GaussianMixtureConditional()
```

But it failed to run. What do the weights mean, and what should I pass to GaussianMixtureConditional()?