Closed ppham27 closed 6 years ago
You're right to think of an ed2 RV as basically just a sample, though it also carries its distribution and, importantly, its stochastic predecessors.
And in eager mode, TF is much like an accelerator-friendly, multi-threaded version of NumPy.
So you can write a function to do your sampling and then use ed2 to intercept sampling actions to get a prior sample, a conditional one, a posterior one, a differentiable (some restrictions apply) likelihood, etc. Or you can use tfp.distributions directly.
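The interception idea described above can be sketched without TFP at all. This is a hypothetical, dependency-free stand-in for `tfp.edward2.interception` (the names `normal_rv`, `interception`, and `model` below are illustrative, not the real API): every sampling action routes through a swappable hook, so the same model function can yield a prior sample or a conditioned value.

```python
import random

_interceptor = None  # module-level hook; None means "sample as usual"

def normal_rv(loc, scale):
    """Sample from Normal(loc, scale), unless an interceptor overrides it."""
    if _interceptor is not None:
        return _interceptor("normal", loc=loc, scale=scale)
    return random.gauss(loc, scale)

class interception:
    """Context manager that installs an interceptor, mirroring ed2's API shape."""
    def __init__(self, fn):
        self.fn = fn
    def __enter__(self):
        global _interceptor
        self._old, _interceptor = _interceptor, self.fn
    def __exit__(self, *exc):
        global _interceptor
        _interceptor = self._old

def model():
    return normal_rv(loc=0.0, scale=1.0)

# Prior sample: just run the model.
prior = model()

# Condition on an observed value by intercepting the sampling action.
with interception(lambda name, **kw: 42.0):
    observed = model()

print(observed)  # 42.0
```

The real `tfp.edward2.interception` works analogously, except the interceptor receives the distribution constructor and its kwargs, which is what lets you substitute observed values or accumulate log-likelihoods.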
On Sat, Nov 3, 2018, 3:38 PM Philip Pham <notifications@github.com> wrote:
In graph mode, if I re-evaluate a random variable, I get a new sample. If I update a variable parameter, I get a new sample with that parameter:
```python
import tensorflow as tf
import tensorflow_probability as tfp

graph = tf.Graph()
with graph.as_default():
  loc = tf.get_variable(
      'loc', (), initializer=tf.constant_initializer(5.), use_resource=True)
  update_loc_op = tf.assign(loc, -5.)
  norm = tfp.edward2.Normal(loc=loc, scale=0.01).value
  init_op = tf.group(tf.global_variables_initializer())
  graph.finalize()

with graph.as_default(), tf.Session() as sess:
  sess.run(init_op)
  print(sess.run(norm))
  print(sess.run(norm))
  print(sess.run((update_loc_op, norm)))
```
produces the output
```
5.0021605           # first sample
4.9917226           # new sample
(-5.0, -4.9986978)  # sample with new loc parameter, works as expected
```
In eager mode, no new sampling happens (which I sort of understand, since I guess the whole program is just one `sess.run` command now). Updating a parameter doesn't lead to a new sample with that parameter either:
```python
import tensorflow as tf
import tensorflow_probability as tfp
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

loc = tfe.Variable(initial_value=5.)
norm = tfp.edward2.Normal(loc=loc, scale=0.01)
print(norm.numpy())
print(norm.numpy())

loc.assign(-5.)
print(loc.numpy())
print(norm.numpy())
print(norm.distribution.sample())
```
produces the output
```
5.022224  # first sample
5.022224  # no new sampling, so RandomVariables are just a single sample?
-5.0      # updated loc parameter
5.022224  # random variable with loc=-5.0 hasn't updated
tf.Tensor(5.004547, shape=(), dtype=float32)  # a new sample still uses the old loc=5.0
```
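The eager behaviour above is easier to see with a dependency-free sketch (the `RandomVariable` class and `make_rv` helper here are hypothetical illustrations, not the TFP implementation): the RV draws its sample once, at construction time, so re-reading it never re-samples, and the only way to pick up a new parameter value is to re-run the construction, e.g. inside a function.

```python
import random

class RandomVariable:
    """Toy RV: samples exactly once, at construction, like an ed2 RV in eager mode."""
    def __init__(self, loc, scale):
        self.loc, self.scale = loc, scale
        self.value = random.gauss(loc, scale)  # sampled here, once

    def numpy(self):
        return self.value  # re-reading returns the cached sample

rv = RandomVariable(loc=5.0, scale=0.01)
first, second = rv.numpy(), rv.numpy()
assert first == second  # same cached sample both times

# To get fresh samples that track the current parameter, reconstruct the RV:
loc = 5.0
def make_rv():
    return RandomVariable(loc=loc, scale=0.01)

loc = -5.0
assert abs(make_rv().numpy() - (-5.0)) < 1.0  # new sample uses the new loc
```

This is essentially why wrapping model construction in a function (and, in TF2, a `tf.function` or Keras layer) is the idiomatic fix: each call rebuilds the distribution from the variable's current value.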
Ping if you need further guidance; also take a look at the tutorials/examples: https://www.tensorflow.org/probability/overview
I have a question that seems related to this, asked here: https://stackoverflow.com/questions/53338975/use-and-modify-variables-in-tensorflow-bijectors
It's not clear why modifying the variables used to construct distributions/bijectors has an effect in graph mode but not in eager mode...
@brianwa84 any update on this?
Just to be clear, I posted a reply on the SO question (not here).
Yes, thanks for answering there. Moving the discussion here: I think (though there may be good reasons why I'm wrong) it would be more intuitive in eager mode if the shift used by the transformed distribution object (as in the SO question) when sampling or computing log-likelihoods were a reference to the actual variable, as in graph mode, rather than its evaluated numpy value (especially if eager mode is going to become more and more central in TF).
TF is standardizing on Keras [1] (perhaps also tf.function) as the way of expressing cascades of computation like this in TF 2.0. To that end, we've been playing with a kind of Keras layer that emits distribution objects, with user control over the tensor-concretization method (the default is sampling) [2]. This is currently more focused on previous-layer activations conditioning a downstream distribution. But I think you could now create a layer that retains a variable reference and whose `call` function creates a TransformedDistribution using the variable instead of the preceding layer's activations. I'm not sure exactly what the Keras-preferred way of allocating variables is so that they are trained; you'd need to look at some existing Keras layers like Dense.
[1] https://medium.com/tensorflow/standardizing-on-keras-guidance-on-high-level-apis-in-tensorflow-2-0-bad2b04c819a
[2] https://github.com/tensorflow/probability/blob/master/tensorflow_probability/python/layers/distribution_layer.py#L234
To answer my own question, here is the way you add a variable in Keras: `self.add_weight`
https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/python/keras/layers/core.py#L937