tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

[Help] What is the best way of representing a joint distribution of two variables? #147

Closed: shouldsee closed this issue 5 years ago

shouldsee commented 6 years ago

I cannot find any Distribution in tensorflow_probability.python.distributions that combines existing distributions into a joint distribution. Why is this the case?

import tensorflow_probability.python.distributions as tfdist
import tensorflow as tf

sess = tf.InteractiveSession()

x1 = tfdist.Normal(loc=0., scale=1.)
# x1 = tfdist.Normal(loc=[0.] * 3, scale=[1.] * 3)
x2 = tfdist.Gamma(concentration=1., rate=2.)

from joint_dist import JointScalar

j = JointScalar([x1, x2])  # Does such a thing exist anywhere, like in edward2?
Y = j.sample(1000).eval()
L = j.log_prob(Y).eval()

My current solution is here: joint_dist.
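For context, a minimal sketch of what such a wrapper could look like (the linked joint_dist module isn't reproduced here, so the class below is an illustrative assumption, not the author's actual code):

import tensorflow as tf

# Hypothetical JointScalar-style helper: stack samples from each
# scalar component and sum their log-densities.
class JointScalar(object):
  def __init__(self, dists):
    self._dists = dists

  def sample(self, n):
    # Draws of shape [n, num_components].
    return tf.stack([d.sample(n) for d in self._dists], axis=-1)

  def log_prob(self, y):
    # Sum the component log-densities over the last axis.
    return tf.add_n([d.log_prob(y[..., i])
                     for i, d in enumerate(self._dists)])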

deoxyribose commented 6 years ago

See https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/Independent
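A quick sketch of what Independent does may help: it reinterprets batch dimensions of a single distribution as event dimensions, rather than combining different families.

import tensorflow_probability as tfp
tfd = tfp.distributions

# Three i.i.d. Normals, reinterpreted as one 3-dimensional event:
d = tfd.Independent(tfd.Normal(loc=[0., 0., 0.], scale=1.),
                    reinterpreted_batch_ndims=1)
print(d.event_shape)  # (3,)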

shouldsee commented 6 years ago

@deoxyribose But Independent() does not seem to accept a list of different distributions, like Independent([Gamma(1., 1.), Normal(0., 1.)]), right?

deoxyribose commented 6 years ago

Right, should have read your post more closely :) Seems like a useful feature to have!

brianwa84 commented 5 years ago

We've discussed the idea of concatenating and slicing distributions, but haven't yet settled on a suitable API. Your proposal works for the scalar case, but what happens when one component is a vector and another is a matrix, or, worse, one is float32 and one is int32 (Categorical)?
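A minimal illustration of the dtype half of that problem:

import tensorflow_probability as tfp
tfd = tfp.distributions

# Samples from different families need not even share a dtype:
print(tfd.Normal(loc=0., scale=1.).dtype)      # tf.float32
print(tfd.Categorical(logits=[0., 0.]).dtype)  # tf.int32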

brianwa84 commented 5 years ago

See also this discussion: https://github.com/tensorflow/probability/blob/master/discussion/joint_log_prob.md
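The pattern described there, roughly (a sketch; the exact helper names in the doc may differ), is to write the joint density as an ordinary Python function over the variables:

import tensorflow_probability as tfp
tfd = tfp.distributions

def joint_log_prob(x, y):
  # p(x, y) = Gamma(x; 1, 2) * Normal(y; 0, 1), as in the original example.
  return (tfd.Gamma(concentration=1., rate=2.).log_prob(x)
          + tfd.Normal(loc=0., scale=1.).log_prob(y))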

srvasude commented 5 years ago

Closing, since I believe https://github.com/tensorflow/probability/blob/f35f3aff6fff6f6873ab2fc5dd2b77b1738fedc7/tensorflow_probability/python/distributions/joint_distribution.py should solve this problem.
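For readers arriving later, the original two-variable example expressed with that class might look like this (a sketch, assuming a recent TFP release):

import tensorflow_probability as tfp
tfd = tfp.distributions

j = tfd.JointDistributionSequential([
    tfd.Normal(loc=0., scale=1.),
    tfd.Gamma(concentration=1., rate=2.),
])
ys = j.sample(1000)  # a list of two tensors, each of shape [1000]
lp = j.log_prob(ys)  # joint log-density, shape [1000]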

arainboldt commented 4 years ago

@srvasude thanks for the link; the JointDistributionSequential class is interesting. But what if you want to create a joint distribution where each component depends on all of the other components? This class seems, as the name suggests, to enforce a sequential dependency.

as per:

Unlike tf.keras.Sequential, each function can depend on the output of all previous elements rather than only the immediately previous.

and

Each list element implements the i-th full conditional distribution, p(x[i] | x[:i]).

Is there a simple way to subclass tfp.distributions.JointDistribution to generate a concatenation of Categorical distributions?

davmre commented 4 years ago

But what if you want to create a joint distribution where each component distribution is dependent on all other components.

Every joint distribution can be factored into a sequence of conditional distributions, so it's possible in principle, even if it might not be the most convenient representation. Can you give a more specific example of a distribution you're trying to represent?
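A sketch of that factorization with JointDistributionSequential, where later entries are callables that receive the earlier samples (most recent first):

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

j = tfd.JointDistributionSequential([
    tfd.Categorical(logits=[0., 1., 2.]),             # p(x)
    lambda x: tfd.Normal(loc=tf.cast(x, tf.float32),  # p(y | x)
                         scale=1.),
])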

arainboldt commented 4 years ago

Yeah, basically I'm trying to model large arrays of survey data. So I have ~250 multinomial/categorical distributions with a variety of event shapes, and I want to model their joint distribution using a GAN and/or VAE. Currently I've got tfd.Categorical distributions in a list and I'm managing all the indexing and processing alright, but I would like to set it up as a subclass of tfp.Distribution in order to facilitate training flows.
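One possible sketch, assuming a TFP version that includes tfd.Blockwise (which concatenates component events into a single vector-valued distribution; whether it fits the full use case here is untested):

import tensorflow_probability as tfp
tfd = tfp.distributions

survey = tfd.Blockwise([
    tfd.Categorical(logits=[0., 1., 2.]),  # question with 3 answers
    tfd.Categorical(logits=[0., 0.]),      # question with 2 answers
])
s = survey.sample(4)  # shape [4, 2]: one column per question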

davmre commented 4 years ago

Ah, that makes sense. Yeah, there's no great TFP-idiomatic way to split up a deep generative model into a joint distribution over multiple parts at the moment. In the medium term, we're looking at extending the Bijector API to allow joint inputs or outputs, but that's still a while away.

That said, I'm still a bit confused about what you're hoping to do. For VAEs and GANs, the marginal distribution over the generator outputs usually isn't tractable anyway. The conditional distribution over outputs given the latents is often nice (e.g., the decoder in a binarized MNIST VAE spits out the logits of 28*28 Bernoulli pixels, which are independent conditioned on the latents even though they're not marginally independent) and can be represented using TFP tools, but I'm not sure you could fit the marginal distribution of a GAN or VAE into a Distribution object --- AFAICT you'd have to give up on computing log_prob in favor of some other loss function like the ELBO?
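A sketch of that conditional-distribution view (decoder_net is a hypothetical Keras-style model returning pixel logits):

import tensorflow_probability as tfp
tfd = tfp.distributions

def decoder_distribution(z, decoder_net):
  # p(x | z): 28*28 Bernoulli pixels, independent given the latents.
  logits = decoder_net(z)  # shape [batch, 28 * 28]
  return tfd.Independent(tfd.Bernoulli(logits=logits),
                         reinterpreted_batch_ndims=1)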

arainboldt commented 4 years ago

Hi again @davmre,

Is there any reason why it wouldn't work, or wouldn't be advisable, to use the tfp.JointDistribution class for the use case described above? It seems like an ideal situation for it.

Also, from what I understand (please correct me), the ELBO is the sum of the KL divergence and the log-probability, so wouldn't I need to calculate the log-prob of the generated sample regardless?
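For reference, a sketch of the usual decomposition (the KL term enters with a minus sign, and the log-prob term is the conditional log p(x|z), not the marginal; encoder, decoder, and prior are hypothetical helpers):

import tensorflow_probability as tfp
tfd = tfp.distributions

# ELBO(x) = E_{q(z|x)}[ log p(x|z) ] - KL( q(z|x) || p(z) )
def elbo(x, encoder, decoder, prior):
  q_z = encoder(x)   # a tfd.Distribution over the latents z
  z = q_z.sample()   # one-sample Monte Carlo estimate of the expectation
  return decoder(z).log_prob(x) - tfd.kl_divergence(q_z, prior)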