blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

example of normalizing flows #284

Open dustinvtran opened 7 years ago

dustinvtran commented 7 years ago

This is easy in Edward: take any model already written in Edward, define the variational family using TransformedDistribution, and run ed.KLqp (or, more explicitly, ed.ReparameterizationKLqp).

import edward as ed
import tensorflow as tf
from edward.models import Normal, TransformedDistribution

def flow(x):
  # forward transformation f(x)
  pass

def inverse_flow(x):
  # inverse transformation f^{-1}(x)
  pass

def log_det_jacobian(x):
  # log |det df/dx| evaluated at x
  pass

# MODEL
z = ... # latent variables
x = ... # observed variables

# INFERENCE
qz = TransformedDistribution(
  base_dist_cls=Normal,
  mu=tf.zeros(d),
  sigma=tf.ones(d),
  transform=flow,
  inverse=inverse_flow,
  log_det_jacobian=log_det_jacobian)

inference = ed.KLqp({z: qz}, data={x: x_data})

The majority of the labor goes into defining the flows and the log-determinants of their Jacobians.
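For concreteness, here is a minimal sketch of those three pieces for a single elementwise affine flow, y = exp(log_scale) * x + shift. The names log_scale, shift, and d are illustrative parameters, not part of Edward's API; a real normalizing flow would stack several richer transformations.

import tensorflow as tf

d = 2  # latent dimensionality, chosen only for illustration
log_scale = tf.Variable(tf.zeros(d))
shift = tf.Variable(tf.zeros(d))

def flow(x):
  # forward map: y = exp(log_scale) * x + shift
  return tf.exp(log_scale) * x + shift

def inverse_flow(y):
  # analytic inverse: x = (y - shift) * exp(-log_scale)
  return (y - shift) * tf.exp(-log_scale)

def log_det_jacobian(x):
  # the Jacobian is diagonal with entries exp(log_scale), so
  # log |det J| = sum(log_scale), independent of x
  return tf.reduce_sum(log_scale)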

bayerj commented 7 years ago

The problem is that while most flows (especially those in the paper by Rezende & Mohamed) are invertible, programming the inverse is impossible; that's because invertibility does not imply an "inverse that can easily be written down with our typical analytical operators". :|
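For example, the planar flow f(z) = z + u * h(w^T z + b) from that paper is invertible under a constraint relating u and w, yet its inverse has no closed form. A rough sketch of the forward map and its log-det term (u, w, b are free parameters; shapes assume z is [batch, d] and u, w are [d], with tensorflow imported as tf as above):

def planar_flow(z, u, w, b):
  # f(z) = z + u * tanh(<w, z> + b); invertible, but not analytically so
  return z + u * tf.tanh(tf.reduce_sum(w * z, -1, keep_dims=True) + b)

def planar_log_det_jacobian(z, u, w, b):
  # |det J| = |1 + <u, psi(z)>| with psi(z) = (1 - tanh^2(<w, z> + b)) * w
  a = tf.tanh(tf.reduce_sum(w * z, -1, keep_dims=True) + b)
  psi = (1.0 - a ** 2) * w
  return tf.log(tf.abs(1.0 + tf.reduce_sum(u * psi, -1, keep_dims=True)))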

I think the only flows currently published for which both the flow and its inverse are straightforward to implement are those from Laurent's NICE and Real NVP papers. The only way to handle the others is to track z_0. This would require an API of the form log_prob(x, z0, z) or so.
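That kind of easy inverse is exactly what a NICE-style additive coupling layer gives you: split z in two and shift one half by a function of the other. A rough sketch (shift_fn stands in for a user-supplied network; none of these names come from Edward's API):

def coupling_flow(z, shift_fn):
  # split z into (z_a, z_b) and shift z_b by a function of z_a
  z_a, z_b = tf.split(z, 2, axis=-1)
  return tf.concat([z_a, z_b + shift_fn(z_a)], axis=-1)

def coupling_inverse(y, shift_fn):
  # the inverse is just the corresponding subtraction
  y_a, y_b = tf.split(y, 2, axis=-1)
  return tf.concat([y_a, y_b - shift_fn(y_a)], axis=-1)

def coupling_log_det_jacobian(z):
  # additive coupling is volume-preserving: log |det J| = 0
  return tf.zeros([])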

bayerj commented 7 years ago

Actually, Durk Kingma's IAF is also invertible, but the inverse is expensive to compute, IIRC.

dustinvtran commented 7 years ago

Oh right. So basically what we'd like from TensorFlow's side is to be able to define TransformedDistribution without having to specify the inverse transformation.

bayerj commented 7 years ago

Not quite. To evaluate the log probability of a sample z_K generated by applying a flow with K steps, you need access to all the intermediate values z_{0:K-1}. Check equation (13) of [1]. But the API is currently some_rv.log_prob(z_sample), not something like some_rv.log_prob(z0_sample, z1_sample, z2_sample, ...).

I think it will get uglier than what you proposed, especially due to quite a bit of bookkeeping.
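Concretely, a density evaluation has to start from z_0 and walk forward through the chain; roughly something like the sketch below, where flows is a hypothetical list of (flow, log_det_jacobian) pairs and q0 is the base distribution:

def flow_log_prob(q0, flows, z0):
  # eq. (13) of [1]: log q_K(z_K) = log q_0(z_0) - sum_k log |det J_k(z_{k-1})|
  z, logp = z0, q0.log_prob(z0)
  for f, log_det_jacobian in flows:
    logp -= log_det_jacobian(z)  # needs the *input* of each step
    z = f(z)
  return z, logp  # returns z_K, but the argument had to be z_0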

[1] https://arxiv.org/abs/1505.05770

yberol commented 7 years ago

@dustinvtran Hi Dustin, is there a working toy example implementing any of the aforementioned flows in Edward?

dustinvtran commented 7 years ago

Not yet. This was waiting on some internal details behind the Bijector class, which now supports much-needed caching. I think most of those are in now, so I can look into it again.
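For reference, the Bijector-based route in tf.contrib.distributions looks roughly like the snippet below; the exact module paths have moved around across TF 1.x releases, so treat this as a sketch rather than the eventual Edward API.

import tensorflow as tf

ds = tf.contrib.distributions

# push a standard normal through Exp; sampling and then evaluating
# log_prob on those samples can reuse the bijector's forward/inverse cache
log_normal = ds.TransformedDistribution(
    distribution=ds.Normal(loc=0., scale=1.),
    bijector=ds.bijectors.Exp())

x = log_normal.sample(5)
lp = log_normal.log_prob(x)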

stephenhelms commented 7 years ago

I just heard about normalizing flows at PyData London and was interested in trying them out. Is this something that is working or in progress now? I'm not sure how much contribution I can make since I'm still learning some of this, but also happy to help out if possible.