naokishibuya / deep-learning

Deep Learning Application Examples
MIT License

Thank you and looking forward to seeing the variational auto-encoder topic #2

Open Khoa-NT opened 5 years ago

Khoa-NT commented 5 years ago

Dear Sir,

I'm really sorry. I don't know how to contact you via email, so I'm writing this on your GitHub.

How are you? I haven't seen a new post from you on Medium, so I hope you are doing well. I just read your tutorials on Medium (Entropy, Cross-Entropy, KL, Deconvolution, ...). Thank you so much for taking the time to write such great tutorials. Now I understand entropy clearly.

I'm looking forward to seeing the tutorial on variational auto-encoders that you mentioned in the KL tutorial.

Thank you again. Sincerely, Khoa

naokishibuya commented 5 years ago

@shaolinkhoa Thanks for asking about the VAE article.

I've been planning a VAE article for quite some time, and I know it will end up being a series of articles. At the moment, however, I don't have a concrete timeline for when I can publish them to Medium.

For the time being, I'd recommend the following video tutorial for a quick overview.

https://www.youtube.com/watch?v=9zKuYvjFFS8

For more technical details, I'd recommend the following tutorial paper on VAE.

https://arxiv.org/pdf/1606.05908.pdf

Hope that helps.

Khoa-NT commented 5 years ago

Thank you for your quick reply. I am reading a paper about VAEs, and they use the KL divergence exactly as you said: "The KL divergence is used to force the distribution of latent variables to be a normal distribution so that we can sample latent variables from the normal distribution. As such, the KL divergence is included in the loss function to improve the similarity between the distribution of latent variables and the normal distribution." That is what made me curious about the KL divergence in VAEs.
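To make that statement concrete for myself, here is a minimal NumPy sketch (my own toy code, not from your tutorials) of the closed-form KL term between the encoder's Gaussian Q(z|X) = N(mu, sigma^2) and the standard normal prior N(0, I), added to a squared-error reconstruction loss. The function names and the reconstruction term are just placeholders for illustration.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.

    Closed form per dimension: 0.5 * (sigma^2 + mu^2 - 1 - log(sigma^2)).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def toy_vae_loss(x, x_reconstructed, mu, log_var):
    """Toy VAE objective: reconstruction error plus the KL regularizer."""
    reconstruction = np.sum((x - x_reconstructed) ** 2)  # placeholder reconstruction term
    kl = gaussian_kl_to_standard_normal(mu, log_var)     # pulls Q(z|X) toward N(0, I)
    return reconstruction + kl

# Example: encoder outputs for one sample with a 3-dimensional latent space
mu = np.array([0.2, -0.1, 0.05])
log_var = np.array([-0.3, 0.1, 0.0])
print(gaussian_kl_to_standard_normal(mu, log_var))  # 0 only when mu = 0 and sigma = 1
```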

Khoa-NT commented 5 years ago

Dear Sir, I read the "Tutorial on Variational Autoencoders", but I can't understand part 2.1, "Setting up the objective", specifically equation (3):

D[ Q(z) || P(z|X) ] = E_{z~Q}[ log Q(z) - log P(X|z) - log P(z) ] + log P(X)    (3)

Here, log P(X) comes out of the expectation because it does not depend on z.

This confused me. My friend told me that we can treat log P(X) as a constant inside this expectation and then pull it out. But following your KL tutorial, I think it should be:

E_{z~Q}[ log P(X) ] = sum_z Q(z|X) * log P(X) = log P(X) * sum_z Q(z|X)

But in the end, we are left with only log P(X). Does that mean the sum equals 1?

carlee0 commented 5 years ago

@shaolinkhoa, I think your argument is exactly correct. In your second equation you can see that X is given in the distribution Q(z|X), and P(X) does not depend on z, so P(X) is a constant in that summation. In the last equation the summation does indeed evaluate to 1 if Q is a discrete distribution, since sum_z Q(z|X) = 1.
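To see this numerically, here is a tiny Python sketch (toy numbers of my own, just for illustration) showing that because log P(X) does not depend on z, pulling it out of the expectation leaves sum_z Q(z|X), which equals 1 for any valid discrete distribution:

```python
import numpy as np

# Made-up discrete Q(z|X) over four latent values; any valid distribution sums to 1
q_z_given_x = np.array([0.1, 0.4, 0.3, 0.2])
log_p_x = np.log(0.05)  # log P(X): a constant that does not depend on z

# E_{z~Q}[ log P(X) ] written out as a sum over z
expectation = np.sum(q_z_given_x * log_p_x)

# Pulling the constant log P(X) out of the sum leaves sum_z Q(z|X) = 1
print(expectation)                    # -2.9957...
print(log_p_x * np.sum(q_z_given_x))  # same value: log P(X) * 1
```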