tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

log_cdf not available in MultivariateNormalFullCovariance #930

Open jonas-eschle opened 4 years ago

jonas-eschle commented 4 years ago

In the distribution MultivariateNormalFullCovariance, log_cdf is not available because the event_shape is overridden. Is there an easy way to circumvent this? And why is this the case here?

Snippet to reproduce the issue (taken from the docs, with a log_cdf call added):

import tensorflow as tf
from tensorflow_probability import distributions as tfd

# Batch slicing example from the docs.
b = tfd.Bernoulli(logits=tf.zeros([3, 5, 7, 9]))
b.batch_shape  # => [3, 5, 7, 9]
b2 = b[:, tf.newaxis, ..., -2:, 1::2]
b2.batch_shape  # => [3, 1, 5, 2, 4]

# Build a batch of 2x2 covariance matrices and their Cholesky factors.
x = tf.random.normal([5, 3, 2, 2])
cov = tf.matmul(x, x, transpose_b=True)
chol = tf.linalg.cholesky(cov)  # tf.cholesky was removed in TF 2.x
loc = tf.random.normal([4, 1, 3, 1])
mvn = tfd.MultivariateNormalTriL(loc, chol)
mvn.log_cdf(0.2)  # <- errors

davmre commented 4 years ago

I don't believe TFP has an implementation of multivariate normal CDFs, and my understanding is that this is because it's actually a somewhat nontrivial problem.

The 'naive' thing you could imagine trying is what a univariate transformed distribution does: when you call td.log_cdf(x), it pulls x back into the space of the base distribution and computes the CDF there (td.distribution.log_cdf(td.bijector.inverse(x))). But that's not correct for multivariate distributions in general.
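A minimal sketch of that pull-back for the univariate affine case x = scale * z + shift with z ~ N(0, 1), using only the Python standard library. The helper names `std_normal_cdf` and `affine_log_cdf` are hypothetical and are not part of the TFP API; the sketch assumes scale > 0 so that the map is increasing.

```python
import math


def std_normal_cdf(z):
    """CDF of the standard normal, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def affine_log_cdf(x, scale, shift):
    """log CDF of x = scale * z + shift, where z ~ N(0, 1) and scale > 0.

    Pulls x back through the inverse map and evaluates the base CDF there.
    """
    if scale <= 0:
        raise ValueError("scale must be positive for this pull-back to be valid")
    z = (x - shift) / scale  # inverse of the affine map
    return math.log(std_normal_cdf(z))


# At x == shift the pulled-back point is z == 0, so the CDF is one half.
print(math.exp(affine_log_cdf(3.0, scale=2.0, shift=3.0)))
```

This works in one dimension because an increasing map preserves the ordering of points, so "everything below x" maps exactly onto "everything below z".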

Suppose we define a multivariate normal x = A z + b as an affine transformation of a standard-normal vector z, and the multivariate CDF of x as the total probability mass 'below and to the left' of x, i.e., expectation_wrt_y(prod([indicator(y[i] < x[i]) for i in range(dims)])). The problem is that the affine transformation A can include a rotation component. This means that 'below and to the left' will in general mean something different for x, in the affine-transformed space, than it did for the original variable z. So we can't compute the multivariate CDF just by transforming back to the original space and querying the standard normal CDF. You have to do something more bespoke that actually integrates over the relevant slice of the multivariate distribution.
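A small numeric illustration of this failure, assuming the simplest possible case: A is a pure 45-degree rotation R and b = 0. Because R R^T = I, x = R z is itself standard normal, so its true CDF is just a product of univariate normal CDFs and can be compared directly against the naive pull-back. The names `std_normal_cdf`, `true_cdf`, and `naive_cdf` are chosen for this sketch only.

```python
import math


def std_normal_cdf(z):
    """CDF of the standard normal, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Rotation by 45 degrees: R = [[c, -s], [s, c]].
theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)

point = (1.0, 0.0)

# Correct CDF: x = R z has identity covariance, so its components are
# independent standard normals.
true_cdf = std_normal_cdf(point[0]) * std_normal_cdf(point[1])

# Naive pull-back: apply R^{-1} = R^T to the query point, then evaluate
# the base distribution's CDF at the pulled-back point.
z0 = c * point[0] + s * point[1]
z1 = -s * point[0] + c * point[1]
naive_cdf = std_normal_cdf(z0) * std_normal_cdf(z1)

print(f"true  CDF at {point}: {true_cdf:.4f}")
print(f"naive CDF at {point}: {naive_cdf:.4f}")
```

The two values differ substantially (about 0.42 versus about 0.18), even though the rotation leaves the distribution itself unchanged: the quadrant 'below and to the left' of the point is rotated into a different region of z-space.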

There are algorithms to compute multivariate normal CDFs. I'm not an expert; I think Genz (1993) is one of the classic papers: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.9631&rep=rep1&type=pdf, although it looks like the state of the art might have improved a bit since then, e.g., in Botev (2016): https://arxiv.org/abs/1603.04166.
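For comparison, a plain Monte Carlo estimate of the multivariate normal CDF can be sketched in a few lines. This is explicitly not the Genz or Botev algorithm, only a brute-force baseline that converges slowly; the function name `mvn_cdf_monte_carlo` and its parameters are hypothetical, and the standard library is used instead of TFP so the sketch stays self-contained.

```python
import math
import random


def mvn_cdf_monte_carlo(x, loc, scale_tril, num_samples=200_000, seed=0):
    """Estimate P(y[i] <= x[i] for all i), with y = loc + scale_tril @ z, z ~ N(0, I)."""
    rng = random.Random(seed)
    dims = len(loc)
    hits = 0
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        inside = True
        for i in range(dims):
            # Row i of the lower-triangular product loc + L z.
            y_i = loc[i] + sum(scale_tril[i][j] * z[j] for j in range(i + 1))
            if y_i > x[i]:
                inside = False
                break
        hits += inside
    return hits / num_samples


# Bivariate normal with unit variances and correlation 0.5.
rho = 0.5
scale_tril = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho**2)]]
estimate = mvn_cdf_monte_carlo([0.0, 0.0], [0.0, 0.0], scale_tril)

# Closed form at the origin, for comparison: 1/4 + arcsin(rho) / (2 * pi).
exact = 0.25 + math.asin(rho) / (2.0 * math.pi)
print(estimate, exact)
```

The error of such an estimate shrinks only like 1 / sqrt(num_samples), and small tail probabilities need many samples to resolve, which is part of why dedicated algorithms exist.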

Assuming that computations at reasonable accuracy can be done performantly (at least up to some dimensionality), we'd certainly be happy to see a CDF implementation for multivariate normals as a pull request.

jonas-eschle commented 4 years ago

I see, that makes sense, thanks a lot for the extensive explanation! I am not an expert myself either, but I will keep it in mind, since we may have someone working on it.