Understanding of `pdf(data, mu, var)`

cloudhs7 commented 2 years ago

Thank you for your excellent work! I've read both the paper (MTAD-GAT: Multivariate Time-series Anomaly Detection via Graph Attention Network) and your source code, and I have a question about the _reconstruction_loss_ part, especially the `pdf(data, mu, var)` function.

In your source code, the reconstruction loss is computed by adding -self._reconstruction_log_probability (which ultimately comes from the -pdf function) and -self._minusDkl.

`_reconstruction_loss1 = -(self._reconstruction_log_probability + self._minusDkl)`

In the paper, the reconstruction loss is the sum of two terms: first, the expected negative log-likelihood of the given input; second, the Kullback-Leibler divergence.
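Written out in standard VAE notation (the symbols below are the usual ones and are not copied from the paper), the two terms are:

$$
\mathcal{L}_{rec} = -\,\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] + D_{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
$$

Assuming `_minusDkl` holds $-D_{KL}$, as its name suggests, the code line above expands to `-_reconstruction_log_probability + D_KL`, which has the same two-term shape.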

I have trouble understanding how this -pdf function serves the same role as the first term (the expected NLL loss) from the paper. I was trying to implement the reconstruction loss exactly as in the paper but had trouble working out the arguments for NLLLoss, and then I found your work.

Can you explain how the -pdf function works as the NLL loss (the expected negative log-likelihood of the given input)?

mangushev commented 2 years ago

Thanks for looking into this. Let me just say something first without digging in. We are solving a maximum likelihood problem. Since the TensorFlow optimizer minimizes, we minimize the negative log-likelihood to find the model parameters. The `pdf` function gives the log-likelihood. Not sure if that's what you were asking about.
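To make that concrete, here is a minimal sketch in plain Python, assuming a diagonal Gaussian decoder. The names `gaussian_log_pdf` and `reconstruction_nll` are illustrative and are not taken from the repository; the actual `pdf(data, mu, var)` may differ in shape handling and numerical details.

```python
import math

def gaussian_log_pdf(data, mu, var):
    """Elementwise log-density of `data` under N(mu, var).

    Hypothetical stand-in for a log-likelihood function; all three
    arguments are equal-length lists of floats, and var must be > 0.
    """
    return [
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(data, mu, var)
    ]

def reconstruction_nll(data, mu, var):
    """Negative log-likelihood, summed over independent dimensions."""
    return -sum(gaussian_log_pdf(data, mu, var))

# Example: the decoder predicts a mean and variance per feature.
x = [0.5, -1.0]
mu = [0.4, -0.8]
var = [1.0, 0.5]
nll = reconstruction_nll(x, mu, var)  # the quantity an optimizer would minimize
```

Minimizing `nll` over the decoder parameters is the same as maximizing the log-likelihood, which is why the negated log-density can play the role of the first loss term.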

cloudhs7 commented 2 years ago

Thanks for your answer. I understand that the PDF function is implemented under the assumption of a Gaussian distribution (in both the encoder and the decoder). So if I assume a Bernoulli distribution in the decoder instead, I can change the PDF function implementation to something similar to a cross-entropy loss, is that right?

mangushev commented 2 years ago

In the papers I have mentioned, and specifically in the original variational autoencoder paper, the authors discuss both the Gaussian and Bernoulli cases. Please follow those equations for the replacement. It is not necessarily a straight replacement; it can be simplified for Bernoulli.
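As a rough illustration of the Bernoulli case, here is a sketch in plain Python. It assumes the decoder outputs a probability per feature (for example through a sigmoid) and that the data lies in [0, 1]. The names `bernoulli_log_pdf` and `binary_cross_entropy` are hypothetical and do not come from the repository.

```python
import math

def bernoulli_log_pdf(data, p, eps=1e-7):
    """Elementwise Bernoulli log-likelihood of `data` given probabilities `p`.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    out = []
    for x, q in zip(data, p):
        q = min(max(q, eps), 1.0 - eps)
        out.append(x * math.log(q) + (1.0 - x) * math.log(1.0 - q))
    return out

def binary_cross_entropy(data, p, eps=1e-7):
    """Negative Bernoulli log-likelihood, summed over dimensions."""
    return -sum(bernoulli_log_pdf(data, p, eps))

# Example: a good prediction gives a lower loss than a bad one.
good = binary_cross_entropy([1.0, 0.0], [0.9, 0.1])
bad = binary_cross_entropy([1.0, 0.0], [0.1, 0.9])
```

Under these assumptions the negated Bernoulli log-likelihood is exactly the binary cross-entropy, so only the log-density term changes; the KL term is left as it is.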

cloudhs7 commented 2 years ago

Oh, I will check the equations in the papers you mentioned. Thanks for your reply!

mangushev / mtad-gat

Understanding of `pdf(data, mu, var)` #3