Open xiaolong1979 opened 3 years ago
Not sure if this is intended behavior, but currently TransformedDistribution does not get the batch shape from bijectors:
mu = tf.range(5, dtype=tf.float32)
dist1 = tfd.Normal(mu, 1)
dist1
#==> <tfp.distributions.Normal 'Normal' batch_shape=[5] event_shape=[] dtype=float32>
dist2 = tfb.Shift(mu)(tfd.Normal(0., 1.))
dist2
#==> <tfp.distributions.TransformedDistribution 'shiftNormal' batch_shape=[] event_shape=[] dtype=float32>
As a result, when you call dist2.sample(), you get a scalar sample from the base distribution, which is then broadcast-added to mu.
I think there is some work going on to give bijectors full shape semantics (sample/batch/event) as well. Maybe @davmre knows a bit more about the roadmap.
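The shared-noise behavior described above can be sketched without TFP. This is only a plain-Python illustration of the shape semantics, not TFP's implementation; the variable names and the fixed seed are assumptions made for the example.

```python
import random

mu = [0.0, 1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)  # fixed seed, only so the sketch is reproducible

# Scalar base distribution with a batched shift: a single noise draw is
# reused for (broadcast across) every element of mu.
shared_noise = [rng.gauss(0.0, 1.0)] * len(mu)

# Batched base distribution: one independent noise draw per element of mu.
independent_noise = [rng.gauss(0.0, 1.0) for _ in mu]

shared = [m + n for m, n in zip(mu, shared_noise)]
independent = [m + n for m, n in zip(mu, independent_noise)]

print(shared_noise)       # the same value repeated five times
print(independent_noise)  # five separate draws
```

In the first case the five outputs differ only by the offsets in mu, which is what dist2.sample() produces when the base distribution has an empty batch shape.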
Thanks @junpenglao. TransformedDistribution may be the issue. I have another example with a similar batch_shape issue in the log_prob method. The example uses a transition function built with TransformedDistribution.
tfd.Normal has a loc parameter for the transition. Most other distributions are not formulated this way and rely on a bijector and TransformedDistribution. It would be great to have bijectors get full shape semantics (sample/batch/event).
Alternatively, you can make it work by using tfd.Sample:
gaussian_walk2 = tfd.MarkovChain(
    initial_state_prior=tfp.distributions.Deterministic(0.),
    transition_fn=lambda _, x: tfd.TransformedDistribution(
        distribution=tfd.Sample(tfd.Normal(loc=0.0, scale=1.), x.shape),
        bijector=tfp.bijectors.Shift(x)),
    num_steps=10)
@junpenglao Thanks for the idea. The sample method works as expected, but log_prob does not seem to work. Did I miss anything?
gaussian_walk2 = tfd.MarkovChain(
    initial_state_prior=tfp.distributions.Deterministic(0.),
    transition_fn=lambda _, x: tfd.TransformedDistribution(
        distribution=tfd.Sample(tfd.Normal(loc=0.0, scale=1.), x.shape),
        bijector=tfp.bijectors.Shift(x)),
    num_steps=10)
print(gaussian_walk2.sample(3))
tf.Tensor(
[[ 0. 0.6361662 0.7333202 -0.49361163 -0.2956681 1.3065095 1.2681572 0.5679376 -0.29533887 -0.05293749]
 [ 0. 0.85948235 -1.1862569 -0.85393983 -0.99621975 -2.624208 -0.59781265 -0.0199101 -1.1001246 -1.0604566 ]
 [ 0. 0.93100184 0.31189167 0.9530297 -0.5100331 -0.7756746 -2.2172916 -1.8662072 -3.9851084 -2.3052304 ]], shape=(3, 10), dtype=float32)
print(gaussian_walk2.log_prob(gaussian_walk2.sample(3)))
error:
InvalidArgumentError Traceback (most recent call last)
I think @junpenglao is correct about the issue with sampling: currently TransformedDistribution expects that the base distribution's batch shape is at least as large as the bijector's 'batch shape' (recently annotated as bijector.experimental_batch_shape). Otherwise, the base distribution will sample fewer degrees of freedom than needed for the final result.
Using tfd.Sample to add a dimension is almost correct, but it will incorrectly reduce over the log_prob because Sample adds event shape, not batch shape (i.e., it defines a distribution over a vector of independent samples, rather than a batch of distributions over scalar samples). To construct a Normal distribution with batch shape, you can either pass a batch of parameters or wrap it with tfd.BatchBroadcast:
gaussian_walk2 = tfd.MarkovChain(
    initial_state_prior=tfp.distributions.Deterministic(0.),
    transition_fn=lambda _, x: tfd.TransformedDistribution(
        # Create a distribution with `distribution.batch_shape == x.shape`.
        # This could also be `tfd.BatchBroadcast(tfd.Normal(0., 1.), to_shape=x.shape)`.
        distribution=tfd.Normal(loc=0.0, scale=tf.ones_like(x)),
        bijector=tfp.bijectors.Shift(x)),
    num_steps=10)
x = gaussian_walk2.sample(3)
print(x.shape) # ==> [3, 10]
lp = gaussian_walk2.log_prob(x)
print(lp.shape) # ==> [3]
Now that bijectors have experimental_batch_shape annotations, it should be possible for TransformedDistribution to do this sort of batch broadcasting automatically. This is on my TODO list, though I don't think we have a particular timeline (contributions always appreciated!).
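To make the event-shape versus batch-shape distinction concrete, here is a small plain-Python sketch of how the two conventions treat the same per-chain log-densities. It does not call TFP; the helper normal_log_prob and the example states are hypothetical and exist only for illustration.

```python
import math


def normal_log_prob(x, loc=0.0, scale=1.0):
    """Log-density of a univariate normal distribution at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical states for three parallel chains at consecutive steps.
prev_state = [0.0, 0.5, -1.0]
next_state = [0.3, 0.1, -0.2]

# One transition log-density per chain.
per_chain = [normal_log_prob(n, loc=p) for n, p in zip(next_state, prev_state)]

# Batch semantics: the three chains stay separate, so the result has
# one entry per chain.
batch_lp = list(per_chain)

# Event semantics (the effect of wrapping the base distribution in Sample):
# the three values are treated as one vector-valued event and summed.
event_lp = sum(per_chain)

print(len(batch_lp))  # one log-prob per chain
print(event_lp)       # a single number for all chains combined
```

With event semantics the per-chain structure is lost at every step, which is why the Sample-based transition fails to give a per-chain log_prob while the batched Normal returns a result of shape [3].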
Hi Dave, this looks great! I will test it next week after I am back from vacation.
Thanks a lot!
Xiaolong
Hi @davmre, there is a new error when I tried it: ImportError: cannot import name '__version__' from 'keras' (/usr/local/lib/python3.7/dist-packages/keras/__init__.py). It did not show up before. How do I fix it?
Here is the complete message:
ImportError Traceback (most recent call last)
Well, somehow it worked when I tried it on an Amazon workspace and ran the following code in the first cell. There is something going on with the Colab or Keras import. I would like to hear any insight.
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, MaxPooling2D
from keras.layers.convolutional import Conv2D
from keras.layers.recurrent import SimpleRNN, LSTM, GRU
from keras.utils import np_utils
from keras import backend as K

from distutils.version import LooseVersion as LV
from keras import __version__

from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

from keras.datasets import mnist, fashion_mnist, imdb

from sklearn.model_selection import train_test_split

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

print('Using Keras version:', __version__, 'backend:', K.backend())
assert(LV(__version__) >= LV("2.0.0"))
Can someone help to explain why the MarkovChain does not generate independent samples after using a bijector in the transition_fn? Thanks!!!
In the code below, gaussian_walk1 and gaussian_walk2 are expected to be the same, since normal(x,1)=x+normal(0,1). While gaussian_walk1.sample(5) gives the expected independent samples, gaussian_walk2.sample(5) gives identical samples.