Open jacksonloper opened 6 years ago
Hi Jackson! This is a great request. Others might have more to say here, but I think the current state of things is that walking the TF graph is strongly discouraged, and code that tries to do so using unpublished APIs is subject to breakage without notice. The brittleness of graph-walking in Edward was a primary motivation for the development of Edward2, which uses its own tracing mechanism to avoid directly walking the TF graph.
Subject to this restriction, you might find that the `expectation` utility (https://github.com/tensorflow/probability/blob/master/tensorflow_probability/python/monte_carlo.py#L29) does some of what you're asking for: given an explicit source of randomness, it returns a Monte Carlo expectation with an unbiased stochastic gradient, using the reparametrization or score-function estimators as appropriate. You can use this to effectively construct stochastic computation graphs, albeit in perhaps a slightly lower-level way than you're thinking of. We'd certainly be excited about designs to make this sort of functionality more cleanly accessible.
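For intuition about the two estimators that utility chooses between, here is a NumPy stand-in (not the TFP API itself; the Normal example, loss, and sample count are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (made up for illustration): x ~ Normal(theta, 1), loss f(x) = x,
# so E[f(x)] = theta and the true gradient d/dtheta E[f(x)] is exactly 1.
theta = 0.5
z = rng.standard_normal(200_000)
x = theta + z  # reparameterized sample: x = theta + standard-normal noise

# Reparameterization estimator: differentiate through the sampling path.
# Here d f(theta + z) / d theta = 1 for every sample, so it is exact.
grad_reparam = np.mean(np.ones_like(z))

# Score-function estimator: no gradient through the sampler; instead weight
# f(x) by the score d log p(x; theta) / d theta = (x - theta).
grad_score = np.mean(x * (x - theta))

print(grad_reparam, grad_score)  # both close to 1.0
```

The reparameterization path typically has far lower variance when it applies; the score-function path works for any distribution whose log-density you can evaluate.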
Is there a white paper somewhere outlining the scope of the API Edward2 is planning to encompass? Or is the main idea just to rewrite Edward in a less brittle way?
@dustinvtran want to take this?
Algorithmic construction of surrogates to estimate gradients of expected values has always seemed like a natural feature for TensorFlow. I think we tried it a few years back, but it never got off the ground. Maybe the time is now, possibly even using modern surrogates such as DiCE, which accommodate higher-order derivatives. There is also some rumbling about this in the Edward community (cf. this issue), but I thought I would mention it here to see what the TensorFlow Probability community thought.
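For what it's worth, the DiCE trick boils down to multiplying each cost term by a "MagicBox" factor exp(log q - stop_gradient(log q)), which equals 1 in value but carries score-function gradients at every order. A rough NumPy sketch (the Bernoulli example and the frozen-argument stand-in for stop_gradient are made up for illustration; derivatives are checked by finite differences since NumPy has no autodiff):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prob(x, p):
    # Bernoulli(p) log-likelihood, elementwise
    return x * np.log(p) + (1 - x) * np.log(1 - p)

def magic_box(p, p_frozen, x):
    # DiCE "MagicBox": exp(log q(x; p) - stop_gradient(log q(x; p))).
    # Its forward value is exactly 1, while under autodiff its derivatives
    # w.r.t. p reproduce the score function at every order.  stop_gradient
    # is mimicked here by passing a frozen copy of p.
    return np.exp(log_prob(x, p) - log_prob(x, p_frozen))

p = 0.3
x = (rng.random(100_000) < p).astype(float)  # Bernoulli(p) samples

assert np.allclose(magic_box(p, p, x), 1.0)  # value 1: loss is unchanged

def surrogate_mean(q):
    # loss f(x) = x, so E[f] = p: first derivative 1, second derivative 0
    return np.mean(x * magic_box(q, p, x))

# Finite-difference first and second derivatives (NumPy has no autodiff)
eps = 1e-4
g1 = (surrogate_mean(p + eps) - surrogate_mean(p - eps)) / (2 * eps)
g2 = (surrogate_mean(p + eps) - 2 * surrogate_mean(p)
      + surrogate_mean(p - eps)) / eps**2
print(g1, g2)  # g1 close to 1.0, g2 close to 0.0
```

In a real autodiff framework the finite differences would just be the gradient op applied twice, which is exactly the higher-order use case DiCE was designed for.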
If you're not familiar with the so-called "stochastic computation graph" (SCG) scene, the bottom line is this:
Say we want to estimate the gradient of the expected value of a random variable with respect to some parameters. If we can use the reparametrization trick then it turns out to be really easy -- but in many cases that trick doesn't apply. In particular, consider the following case:
- `loss` is a random tensor whose distribution is somehow determined by another tensor `T`. For example, maybe `loss` is a sample from a negative binomial distribution, and `T` gives the alpha parameter. Or maybe `loss` is some complicated function of a sample from a negative binomial distribution where `T` gives the alpha parameter.
- `sess.run(loss)` will give me a sample from `loss`, which can be understood as an unbiased estimator for the expected value of `loss`.
- `sess.run(tf.gradients(loss, T))` will generally not be an unbiased estimator for the derivative of the expected value of `loss` with respect to `T`.

However, at least as of 2016 we now know how to write a general function `surrogate(loss)` that crawls the graph and automatically produces a tensor `loss_surrogate`, such that `sess.run(tf.gradients(surrogate(loss), T))` gives an unbiased estimator for the derivative of the expected value of `loss` with respect to `T`.

To work, the algorithm basically just needs to be able to compute the pmf or pdf for any op that is stochastic in a way that depends on its input. In most cases we can write any complicated random stuff in terms of compositions of simple distributions for which we know the likelihood, so this is no problem. The algorithm can then define a `loss_surrogate` tensor which lets you get estimators of the gradient of expected values. Note you don't have to know ahead of time what you might want to take the gradient with respect to.

It would be super nice to implement this `surrogate` function for TF. I think it would actually be fairly straightforward to implement, but we would definitely need community support to keep it maintained. We would need to handle corner cases for random ops whose density can't be written down. Moreover, anytime someone invents a new way of drawing randomness, we would need to think about how to make sure it plays nice with whatever `surrogate(loss)` function we might cook up.
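To make the surrogate idea concrete, here is a minimal NumPy sketch of the score-function construction (the Bernoulli example, the frozen-argument stand-in for stop_gradient, and the finite-difference gradient check are all made up for illustration; a real implementation would build `loss_surrogate` in the graph and use autodiff):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prob(x, p):
    # Bernoulli(p) log-likelihood, elementwise
    return x * np.log(p) + (1 - x) * np.log(1 - p)

# Samples are drawn once and then held fixed: the sampler contributes no
# gradient, which is exactly when naive backprop through `loss` is biased.
p = 0.3
x = (rng.random(100_000) < p).astype(float)
loss = x  # f(x) = x, so E[loss] = p and d/dp E[loss] = 1

def surrogate_mean(q):
    # Score-function surrogate: loss + stop_gradient(loss) * log q(x; q).
    # Differentiating its mean w.r.t. q recovers the REINFORCE estimator;
    # `loss` is treated as a constant with respect to q here.
    return np.mean(loss + loss * log_prob(x, q))

# NumPy has no autodiff, so check the gradient by central differences,
# holding the samples fixed (mimicking stop_gradient through the sampler).
eps = 1e-5
grad = (surrogate_mean(p + eps) - surrogate_mean(p - eps)) / (2 * eps)
print(grad)  # close to 1.0, the true gradient of E[loss] w.r.t. p
```

A graph-crawling `surrogate` would automate exactly this: find each stochastic op on the path to `loss`, look up its log-density, and attach one such term per op.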