Closed: Hanawh closed this issue 3 years ago
`upper` and `lower` are the estimated CDF logits at the bounds of the current input; the densities (probability masses) are derived from their difference.
You can find more information in Ballé et al.'s papers: *End-to-end Optimized Image Compression* and *Variational image compression with a scale hyperprior*.
The `sign` is used for floating-point precision. It was ported directly from the tensorflow/compression project; you can find some explanations in its comments.
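The trick relies on the identity `sigmoid(-x) = 1 - sigmoid(x)`: flipping the sign of both logits leaves the absolute difference mathematically unchanged, but moves the sigmoid evaluations to the tail near 0, where float32 keeps many significant digits, instead of the tail near 1, where it saturates. A minimal standalone sketch of the effect (the logit values here are made up for illustration):

```python
import torch

def likelihood_stable(lower: torch.Tensor, upper: torch.Tensor) -> torch.Tensor:
    # Flip the sign so sigmoid is evaluated on the side where it stays
    # accurate: when both logits are large and positive, sigmoid saturates
    # toward 1 and the difference of two near-1 values underflows, while
    # sigmoid of the negated logits yields small values near 0 that are
    # represented precisely.
    sign = -torch.sign(lower + upper)
    return torch.abs(torch.sigmoid(sign * upper) - torch.sigmoid(sign * lower))

def likelihood_naive(lower: torch.Tensor, upper: torch.Tensor) -> torch.Tensor:
    # Direct difference: mathematically identical, numerically fragile.
    return torch.sigmoid(upper) - torch.sigmoid(lower)

lower = torch.tensor([20.0])  # both logits deep in the positive tail
upper = torch.tensor([21.0])
print(likelihood_naive(lower, upper))   # both sigmoids round to 1.0: underflows to 0
print(likelihood_stable(lower, upper))  # difference of two small values: nonzero
```

With both logits in the positive tail, the naive version loses all the information to rounding, while the sign-flipped version recovers a mass on the order of `1e-9`.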
Thanks for your reply!
```python
@torch.jit.unused
def _likelihood(self, inputs: Tensor) -> Tensor:
    half = float(0.5)
    v0 = inputs - half
    v1 = inputs + half
    lower = self._logits_cumulative(v0, stop_gradient=False)
    upper = self._logits_cumulative(v1, stop_gradient=False)
    sign = -torch.sign(lower + upper)
    sign = sign.detach()
    likelihood = torch.abs(
        torch.sigmoid(sign * upper) - torch.sigmoid(sign * lower)
    )
    return likelihood
```
Why not:

```python
@torch.jit.unused
def _likelihood(self, inputs: Tensor) -> Tensor:
    half = float(0.5)
    v0 = inputs - half
    v1 = inputs + half
    lower = self._logits_cumulative(v0, stop_gradient=False)
    upper = self._logits_cumulative(v1, stop_gradient=False)
    likelihood = torch.sigmoid(upper) - torch.sigmoid(lower)
    return likelihood
```
Debugging, I found:

```python
>>> ll = torch.sigmoid(upper) - torch.sigmoid(lower)
>>> likelihood.isclose(ll).all()
True
>>> (likelihood == ll).all()
False
```
Why?!
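The two expressions are only *mathematically* equivalent. With `sign = -1`, the stable version computes `|sigmoid(-upper) - sigmoid(-lower)|`, which equals `sigmoid(upper) - sigmoid(lower)` via the identity `sigmoid(-x) = 1 - sigmoid(x)`; but floating-point rounding happens at different points along the two evaluation routes, so the results agree to within the `isclose` tolerance without being bit-identical. A small sketch of the same effect (arbitrary inputs, just for illustration):

```python
import torch

x = torch.linspace(-5.0, 5.0, steps=101)
a = torch.sigmoid(-x)        # one evaluation route
b = 1.0 - torch.sigmoid(x)   # mathematically the same value, different rounding

print(torch.isclose(a, b).all())  # equal within the default rtol/atol tolerance
print((a == b).all())             # typically not bit-identical everywhere
```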
I am a beginner in image compression. I don't understand why `lower`, `upper`, and `sign` are calculated in this step.