Closed: AlejandroTL closed this issue 3 years ago.
From https://github.com/AntixK/PyTorch-VAE/issues/11 (closed): "It is just the bias correction term for accounting for the minibatch. When small batch-sizes are used, it can lead to a large variance in the KLD value. But it should work without that kld_weight term too."
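For context, here is a minimal sketch of how such a `kld_weight` typically enters the loss. This is an illustrative standard Gaussian VAE loss, not the repo's exact `loss_function`; the function name `vae_loss` and the tensor shapes in the toy usage are my own assumptions.

```python
import torch
import torch.nn.functional as F

def vae_loss(recons, target, mu, log_var, kld_weight):
    """Illustrative VAE loss: reconstruction term plus a weighted KL divergence.

    kld_weight is assumed to be batch_size / dataset_size, i.e. the fraction of
    the dataset covered by this minibatch; it only rescales the KLD term.
    """
    recons_loss = F.mse_loss(recons, target)
    # KL divergence between q(z|x) = N(mu, exp(log_var)) and the prior N(0, I)
    kld_loss = torch.mean(
        -0.5 * torch.sum(1 + log_var - mu ** 2 - log_var.exp(), dim=1), dim=0
    )
    return recons_loss + kld_weight * kld_loss

# Toy usage: a batch of 10 samples out of a dataset of 100 -> kld_weight = 0.1
recons = torch.rand(10, 3, 64, 64)
target = torch.rand(10, 3, 64, 64)
mu = torch.zeros(10, 128)
log_var = torch.zeros(10, 128)
loss = vae_loss(recons, target, mu, log_var, kld_weight=10 / 100)
```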
I was also wondering when I first saw it. Maybe a little FAQ in the README would be helpful since there seem to be several issues referencing this.
Related question: https://github.com/AntixK/PyTorch-VAE/issues/40 Should the dimensions of the input and the latent vector play a role instead, or as well?
Hi!
Maybe it's a silly question, but why do you use a KL weight term? I understand that it's the fraction of the total dataset covered by a batch: for instance, if there are 100 observations and the batch size is 10, the kl_weight should be 0.1. But why do you use it? I've looked at some other implementations and couldn't find it there. I'm sure there's a reason, but I can't see why you weight only the KL divergence and not the reconstruction loss.
Thank you so much! :)