InterDigitalInc / CompressAI

A PyTorch library and evaluation platform for end-to-end compression research
https://interdigitalinc.github.io/CompressAI/
BSD 3-Clause Clear License

Entropy model of hyper-latents #155

Closed hyeseojy closed 2 years ago

hyeseojy commented 2 years ago

Hi,

What should I do if I want to use zero-mean Gaussian entropy parameters for the hyper-latents z, like what is done for y? To do that, can I use self.gaussian_conditional as it is?

For example, with the ScaleHyperprior model, the code would look like the snippet below. (It is all the same as the existing code; I only added the scale_z part.)

# Imports needed to make the snippet self-contained (module paths as in CompressAI 1.x)
import torch
import torch.nn as nn

from compressai.entropy_models import GaussianConditional
from compressai.layers import GDN
from compressai.models import CompressionModel
from compressai.models.utils import conv, deconv


class ScaleHyperprior(CompressionModel):

    def __init__(self, N, M, **kwargs):
        super().__init__(entropy_bottleneck_channels=N, **kwargs)

        self.g_a = nn.Sequential(
            conv(3, N),
            GDN(N),
            conv(N, N),
            GDN(N),
            conv(N, N),
            GDN(N),
            conv(N, M),
        )

        self.g_s = nn.Sequential(
            deconv(M, N),
            GDN(N, inverse=True),
            deconv(N, N),
            GDN(N, inverse=True),
            deconv(N, N),
            GDN(N, inverse=True),
            deconv(N, 3),
        )

        self.h_a = nn.Sequential(
            conv(M, N, stride=1, kernel_size=3),
            nn.ReLU(inplace=True),
            conv(N, N),
            nn.ReLU(inplace=True),
            conv(N, N),
        )

        self.h_s = nn.Sequential(
            deconv(N, N),
            nn.ReLU(inplace=True),
            deconv(N, N),
            nn.ReLU(inplace=True),
            conv(N, M, stride=1, kernel_size=3),
            nn.ReLU(inplace=True),
        )

        self.gaussian_conditional = GaussianConditional(None)
        self.N = int(N)
        self.M = int(M)

    @property
    def downsampling_factor(self) -> int:
        return 2 ** (4 + 2)

    def forward(self, x):
        y = self.g_a(x)
        z = self.h_a(torch.abs(y))
        # z_hat, z_likelihoods = self.entropy_bottleneck(z)
        # Fixed unit scale for the zero-mean Gaussian on z (device follows z)
        scale_z = torch.ones_like(z)
        z_hat, z_likelihoods = self.gaussian_conditional(z, scale_z)
        scales_hat = self.h_s(z_hat)
        y_hat, y_likelihoods = self.gaussian_conditional(y, scales_hat)
        x_hat = self.g_s(y_hat)

        return {
            "x_hat": x_hat,
            "likelihoods": {"y": y_likelihoods, "z": z_likelihoods},
        }

If this is possible, def forward, def compress, etc. would need to be revised accordingly (a rough sketch of what I have in mind follows below); I think z could then be handled in the same way as y.
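For reference, here is an untested sketch of what I imagine compress and decompress would then look like, assuming GaussianConditional's build_indexes / compress / decompress can be used for z exactly as they are for y, and that the model's update() has been called so the scale table is populated:

    def compress(self, x):
        y = self.g_a(x)
        z = self.h_a(torch.abs(y))

        # Fixed unit scale for z, mirroring the forward pass above
        scale_z = torch.ones_like(z)
        z_indexes = self.gaussian_conditional.build_indexes(scale_z)
        z_strings = self.gaussian_conditional.compress(z, z_indexes)
        z_hat = self.gaussian_conditional.decompress(z_strings, z_indexes)

        scales_hat = self.h_s(z_hat)
        indexes = self.gaussian_conditional.build_indexes(scales_hat)
        y_strings = self.gaussian_conditional.compress(y, indexes)
        return {"strings": [y_strings, z_strings], "shape": z.size()[-2:]}

    def decompress(self, strings, shape):
        # Rebuild the same unit scales for z (batch size 1 assumed here)
        device = next(self.parameters()).device
        scale_z = torch.ones((1, self.N, *shape), device=device)
        z_indexes = self.gaussian_conditional.build_indexes(scale_z)
        z_hat = self.gaussian_conditional.decompress(strings[1], z_indexes)

        scales_hat = self.h_s(z_hat)
        indexes = self.gaussian_conditional.build_indexes(scales_hat)
        y_hat = self.gaussian_conditional.decompress(strings[0], indexes)
        x_hat = self.g_s(y_hat).clamp_(0, 1)
        return {"x_hat": x_hat}

Since scale_z is a constant 1.0, build_indexes would just snap it to the nearest entry in the scale table.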

Is the above approach valid? (The code above does run when used for training.) If not, I would appreciate a link to something I could refer to.

Sorry if this is a naive question that is hard to answer. Thanks.

fracape commented 2 years ago

Hi @hyeseojy, you are proposing a different architecture from the paper we refer to here. Have you trained it and compared results? I think this is more of a "discussion" post than an issue. To be honest, I think you should modify your entropy bottleneck instead of using gaussian_conditional, since you don't actually condition the entropy model of z on another prior, the way gaussian_conditional(y, scales_hat) does for y. Please let us know in Discussions if your model works and outperforms this one. If it's published, we could add it. Thanks
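If the goal really is a zero-mean Gaussian prior on z, one hypothetical way to keep it packaged as its own unconditional, bottleneck-like module (sketch only; ZeroMeanGaussianBottleneck is not an existing CompressAI class, and imports are as in the snippet above) is a zero-mean Gaussian with one learnable scale per channel, rather than threading a constant scale tensor through forward:

class ZeroMeanGaussianBottleneck(nn.Module):
    # Hypothetical unconditional prior for z: a zero-mean Gaussian with a
    # learnable per-channel scale, reusing GaussianConditional internally.
    def __init__(self, channels):
        super().__init__()
        self.gaussian_conditional = GaussianConditional(None)
        # one log-scale per channel, shared across spatial positions
        self.log_scale = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, z):
        scale = torch.exp(self.log_scale).expand_as(z)
        z_hat, z_likelihoods = self.gaussian_conditional(z, scale)
        return z_hat, z_likelihoods

The model's forward pass would then call this module where the entropy bottleneck used to be; whether it matches or outperforms the learned factorized bottleneck is an empirical question, as noted above.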