tigvarts / vaeac

Variational Autoencoder with Arbitrary Conditioning
MIT License

relationship between VAEAC and CVAE in my problem #4

Open leoncesc opened 4 years ago

leoncesc commented 4 years ago

Thanks for sharing your great work!

First, I will describe my problem. I want to use a generative model (VAEAC or CVAE) to predict a future trajectory, say 3 seconds, given the past 2 seconds of trajectory, so in my problem the mask variable b in VAEAC is constant.

The generative process of VAEAC is similar to that of CVAE: for each object we first generate z ~ p(z | x_{1-b}, b) using the prior network, and then sample the unobserved features x_b ~ p(x_b | z, x_{1-b}, b) using the generative network. So VAEAC still uses three trainable networks (a recognition network, a prior network, and a generative network), and the latent variable z is sampled from the recognition network during training and from the prior network during testing. Does VAEAC then have the same problem as CVAE, namely that it can produce good reconstructions of y given z sampled from the recognition network q(z | x, b), while samples of y given z sampled from the prior network p(z | x_{1-b}, b) are not realistic?
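To make my understanding of this two-step process concrete, here is a minimal sketch. The networks and the mask convention (1 = unobserved) are my own placeholders, not the classes in this repository:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the prior and generative networks (not the repo's classes).
d_x, d_z = 10, 4
prior_net = nn.Linear(2 * d_x, 2 * d_z)      # outputs [mu_z, log_sigma_z] of p(z | x_{1-b}, b)
gen_net = nn.Linear(d_z + 2 * d_x, 2 * d_x)  # outputs [mu_x, log_sigma_x] of p(x_b | z, x_{1-b}, b)

x = torch.randn(1, d_x)
b = torch.zeros(1, d_x); b[:, 5:] = 1.0      # last 5 features are unobserved
x_observed = x * (1 - b)                     # zero out the unobserved part

# Step 1: sample z from the prior network given the observed features and the mask.
mu_z, log_sigma_z = prior_net(torch.cat([x_observed, b], dim=1)).chunk(2, dim=1)
z = mu_z + log_sigma_z.exp() * torch.randn_like(mu_z)

# Step 2: sample the unobserved features from the generative network.
mu_x, log_sigma_x = gen_net(torch.cat([z, x_observed, b], dim=1)).chunk(2, dim=1)
x_unobserved = mu_x + log_sigma_x.exp() * torch.randn_like(mu_x)

# Keep the observed entries and fill in the generated ones.
x_completed = x_observed + b * x_unobserved
```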

So my first question is: is VAEAC exactly the same as CVAE in my case?

My second question is: in the arbitrary conditioning case, what is the difference between VAEAC and CVAE apart from introducing the distribution p(b), the regularization of the latent variable distribution, and the handling of missing features in the input? Is the loss of VAEAC equal to that of CVAE?

My third question is: if we want to generate different outputs, do we need to sample a different latent variable z for each output?

leoncesc commented 4 years ago
Furthermore, the explanation in the GaussianCategoricalLoss class conflicts with the one in the README.md.

GaussianCategoricalLoss class in prob_utils.py:

For example, if one_hot_max_sizes is [3, 1, 1, 2], then the distribution
parameters for one object is the vector
[p_00, p_01, p_02, p_03, mu_1, sigma_1, mu_2, sigma_2, p_30, p_31],
where Softmax([p_00, p_01, p_02, p_03]) and Softmax([p_30, p_31])
are probabilities of the first and the fourth feature categories
respectively in the model generative distribution, and
Gaussian(mu_1, sigma_1 ^ 2) and Gaussian(mu_2, sigma_2 ^ 2) are
the model generative distributions on the second and the third features.

README.md:

For example, for a dataset with a binary feature, three real-valued features and a categorical 
feature with 10 classes the correct --one_hot_max_sizes arguments are 2 1 1 1 10.
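To illustrate the layout these descriptions refer to, here is my own sketch (not the code from prob_utils.py), assuming a categorical feature with K classes occupies K logits and a real-valued feature occupies a mean and a standard deviation:

```python
# Illustrative parameter-vector splitter (my own sketch, not the repo's code).
# Assumption: a categorical feature with K >= 2 classes occupies K logits,
# a real-valued feature (size 0 or 1) occupies two values (mu, sigma).
def split_distribution_params(params, one_hot_max_sizes):
    groups, i = [], 0
    for size in one_hot_max_sizes:
        if size <= 1:                              # real-valued feature
            groups.append(('gaussian', params[i:i + 2]))
            i += 2
        else:                                      # categorical feature
            groups.append(('categorical', params[i:i + size]))
            i += size
    return groups

# README example: a binary feature, three real-valued features, a 10-class categorical
sizes = [2, 1, 1, 1, 10]
n_params = sum(2 if s <= 1 else s for s in sizes)  # 2 + 2 + 2 + 2 + 10 = 18
print(split_distribution_params(list(range(n_params)), sizes))
```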
tigvarts commented 4 years ago

Thank you for your interest! I answer your questions below.

Does VAEAC then have the same problem as CVAE, namely that it can produce good reconstructions of y given z sampled from the recognition network q(z | x, b), while samples of y given z sampled from the prior network p(z | x_{1-b}, b) are not realistic?

We don't observe this problem in our experiments. Furthermore, the Gaussian Stochastic Neural Network and the Hybrid Model, proposed as solutions to this problem in the CVAE paper, sometimes even decrease the model's quality in terms of NLL and the diversity of generated samples in our experiments. We devoted Appendix C of our paper to the analysis of these effects; you can find more details there. In a nutshell, we don't recommend using GSNN or the Hybrid Model when your target conditional distribution has multiple local maxima that are sufficiently different from each other.

is VAEAC exactly the same as CVAE in my case?

You are right: with a constant b, VAEAC turns into CVAE. VAEAC has an additional regularization in the latent space, but it is not very important in practice.

in the arbitrary conditioning case, what is the difference between VAEAC and CVAE

The difference is that CVAE is not applicable to the arbitrary conditioning case, because it requires fixing the structure (roughly speaking, the dimensionality) of the random variables x and y (or x_{1-b} and x_b in VAEAC notation) at the training stage. VAEAC is an extension of CVAE designed to handle arbitrary conditioning at both the training and testing stages. So in the arbitrary conditioning case the losses are different (or even incomparable, because they correspond to different problems). But with a constant b, the VAEAC loss turns exactly into the CVAE loss (if we forget about the regularization in the latent space).
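For reference, this is the variational lower bound as I would sketch it (my notation may differ slightly from the paper's):

```latex
\log p_{\theta,\psi}(x_b \mid x_{1-b}, b) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, b)}\big[\log p_\theta(x_b \mid z, x_{1-b}, b)\big]
  \;-\; D_{\mathrm{KL}}\!\big(q_\phi(z \mid x, b)\,\|\,p_\psi(z \mid x_{1-b}, b)\big)
```

With a constant b and the identifications y := x_b, x := x_{1-b}, this is exactly the CVAE bound, up to the additional latent-space regularization mentioned above.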

If we want to generate different outputs, do we need to sample a different latent variable z for each output?

Yes, that is what you need to do.
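For example (with placeholder prior parameters and a toy decoder, not this repository's classes), drawing several z from the prior gives several distinct predicted futures:

```python
import torch

# Placeholder prior parameters and decoder; in practice these come from the
# trained prior and generative networks given the observed past trajectory.
mu_z, sigma_z = torch.zeros(4), torch.ones(4)
decoder = torch.nn.Linear(4, 30)  # toy stand-in mapping z to a 3-second future trajectory

futures = []
for _ in range(5):                                 # 5 different outputs
    z = mu_z + sigma_z * torch.randn_like(mu_z)    # a fresh z for each sample
    futures.append(decoder(z))                     # each z yields a different future
```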

the explanation in the GaussianCategoricalLoss class conflicts with the one in the README.md

Those are just two different examples of one_hot_max_sizes, but it is probably better if the function docstring and the README use the same example; thank you for noticing this.