Closed cucNaifuXue closed 1 year ago
`aux_loss` adjusts only the `EntropyBottleneck.quantiles` to ensure that the trained distribution fits a few target boundary conditions at runtime. Nothing further. It does not take any image data as input, and it has no effect on the training of the model, since `EntropyBottleneck.quantiles` is not used in the model's forward pass during training.
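To make those "boundary conditions" concrete, here is a self-contained toy in plain Python (not CompressAI code): a frozen stand-in CDF plays the role of the learned distribution, and gradient descent on the three quantiles alone drives `|c(q_i) - target_i|` toward zero. No image data and no model weights are involved, mirroring what the aux optimizer updates. The tail mass here is much larger than CompressAI's default, purely so the toy converges quickly.

```python
import math

SCALE = 2.0  # scale of the frozen stand-in CDF (illustrative value)

def cdf(x):
    # Logistic CDF standing in for the learned distribution c(y).
    return 1.0 / (1.0 + math.exp(-x / SCALE))

def aux_loss(quantiles, tail_mass=1e-2):
    # Sum of |c(q_i) - target_i| over the three boundary targets.
    targets = (tail_mass, 0.5, 1.0 - tail_mass)
    return sum(abs(cdf(q) - t) for q, t in zip(quantiles, targets))

def fit_quantiles(steps=5000, lr=0.5):
    # Gradient descent on the quantiles alone -- no image data, no
    # model weights are touched.
    q = [0.0, 0.0, 0.0]
    eps = 1e-4
    for _ in range(steps):
        for i in range(3):
            # Finite-difference gradient of the aux loss w.r.t. q[i].
            hi = q[:]; hi[i] += eps
            lo = q[:]; lo[i] -= eps
            grad = (aux_loss(hi) - aux_loss(lo)) / (2.0 * eps)
            q[i] -= lr * grad
    return q
```

After fitting, `cdf(q[0])` sits near the tail mass and `cdf(q[2])` near one minus the tail mass, while nothing resembling a forward pass ever reads `q`.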
Thus, I would say that minimizing `aux_loss` as much as possible isn't critical, since it has nearly no effect on the trained RD performance. All it does is make sure the support of the encoding distribution is finite, ensuring that we don't use too many symbols at runtime, and that symbols have some probability over a precision threshold. Potentially, it may also act as a small regularizer that keeps the distributions in check. The `.quantiles` are not used at all during training, so `aux_loss` is not relevant until after training finishes.
Related discussions:
> it has nearly no effect on the trained RD performance.

Does that mean the user shouldn't adjust the lr of the aux optimizer according to some metric when training the model, really?
The only time the `.quantiles` parameters are referenced during training is in `_get_medians`, which reads `.quantiles[:, :, 1]` (the midpoints):

...which is only used as an offset prior to adding noise:

The model should not be too heavily influenced by a near-constant "midpoint" bias.
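As a rough stdlib-Python sketch of that offsetting (a hypothetical illustration of the behavior described above, not CompressAI's actual `quantize` code): during training the median-centered latent is perturbed with uniform noise as a differentiable stand-in for rounding, while at inference it is hard-rounded relative to the same median.

```python
import random

def quantize_train(y, median, rng):
    # Offset by the (near-constant) median, then add uniform noise in
    # [-0.5, 0.5) as a differentiable proxy for rounding.
    return [(v - median) + rng.uniform(-0.5, 0.5) for v in y]

def quantize_eval(y, median):
    # Inference counterpart: hard rounding of the median-centered value,
    # with the median added back for reconstruction.
    return [round(v - median) + median for v in y]
```

Since the median is nearly constant per channel, the training-time effect is just a small, almost-fixed shift of every latent, which is why the model should not be strongly influenced by it.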
Funnily enough:

`aux_optimizer.step()`

The aux loss trains the quantiles so that the learned CDF `c(y)` satisfies:

c(q_0) = tail_mass
c(q_1) = 0.5
c(q_2) = 1 - tail_mass
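Reading those three conditions off a concrete CDF shows why they pin down a finite symbol support. The sketch below (stdlib Python; the logistic CDF is only a stand-in for the learned `c`, and the names are illustrative) solves each condition by bisection and measures the resulting support width:

```python
import math

def logistic_cdf(x, scale=1.5):
    # Stand-in for the learned c(y); CompressAI learns this instead.
    return 1.0 / (1.0 + math.exp(-x / scale))

def solve_quantile(target, lo=-100.0, hi=100.0, iters=200):
    # Bisection for q with c(q) == target (c is monotonic increasing).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if logistic_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

tail_mass = 1e-9
q0 = solve_quantile(tail_mass)        # c(q_0) = tail_mass
q1 = solve_quantile(0.5)              # c(q_1) = 0.5, i.e. the median
q2 = solve_quantile(1.0 - tail_mass)  # c(q_2) = 1 - tail_mass
support_width = q2 - q0               # symbol range used at runtime
```

Only `tail_mass` probability lies outside `[q0, q2]`, so the entropy coder can restrict itself to that finite range of symbols.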
Hi,

I am confused about the auxiliary loss. It should be minimized, but I don't know its influence on the whole model's RD performance. Should I adjust the lr of the aux optimizer according to some metric when training my own model?

In the "Custom model" tutorial, the example looks like the factorized-prior model, `bmshj2018-factorized`. Rate and distortion are included in the main loss, and I don't understand how the aux loss works.

Thanks for any answers.