wiseodd / controlled-text-generation

Reproducing Hu et al.'s ICML 2017 paper "Toward Controlled Generation of Text"
BSD 3-Clause "New" or "Revised" License

KL divergence #21

Open rainyrainyguo opened 6 years ago

rainyrainyguo commented 6 years ago

Can you give some explanation of the KL divergence term? I am a little bit confused by kl_loss = torch.mean(0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1)). Thank you so much!

JianLiu91 commented 5 years ago

It is the KL divergence between two Gaussian distributions, i.e., the prior p(z) ~ N(0, 1) and the posterior q(z|h) ~ N(mu, var). See the original VAE paper: https://arxiv.org/pdf/1312.6114.pdf
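
For reference, the closed-form KL divergence between a diagonal Gaussian and the standard normal prior, per latent dimension, is

D_KL( N(mu, sigma^2) || N(0, 1) ) = 1/2 * (sigma^2 + mu^2 - 1 - log sigma^2)

Substituting logvar = log sigma^2 (so sigma^2 = exp(logvar)), then summing over the latent dimensions (dim 1) and averaging over the batch, gives exactly the kl_loss expression quoted above.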
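To see that this matches the standard Gaussian KL, here is a minimal numerical sketch that cross-checks the closed form against torch.distributions (mu and logvar here are random stand-ins for the encoder outputs, not the repo's actual tensors):

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)

# Hypothetical stand-ins for the encoder outputs (batch of 4, latent dim 8).
mu = torch.randn(4, 8)
logvar = torch.randn(4, 8)

# Closed-form KL as in the repo: sum over latent dims, mean over the batch.
kl_loss = torch.mean(0.5 * torch.sum(torch.exp(logvar) + mu**2 - 1 - logvar, 1))

# Cross-check with torch.distributions: q(z|h) = N(mu, sigma) with
# sigma = exp(0.5 * logvar), and p(z) = N(0, 1) per dimension.
q = Normal(mu, torch.exp(0.5 * logvar))
p = Normal(torch.zeros_like(mu), torch.ones_like(logvar))
kl_ref = kl_divergence(q, p).sum(1).mean()

assert torch.allclose(kl_loss, kl_ref)
print(kl_loss.item())  # same value from both computations
```

The sum over dim 1 treats the latent dimensions as independent (a diagonal covariance), so the multivariate KL decomposes into a sum of univariate Gaussian KLs.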