tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Tensorflow Probability change of prior in tfp.layers (Possible Issue) #823

Open MMorafah opened 4 years ago

MMorafah commented 4 years ago

Hello,

For my research I need to change the prior distribution parameters to train a Bayesian neural network model. More specifically, I want to change the mu and sigma of the Gaussian prior.

I have read the core code of TensorFlow Probability and implemented my own code for changing the prior. However, after a couple of extreme experiments, it turned out that the results directly contradict the theory. So the problem is either in the way I am implementing the prior, in the Bayesian weight updates that TensorFlow Probability implements, or in the derivation behind those updates.

The easiest first step is to make sure I am implementing the change of prior correctly in my code. Can you please let me know the correct way to implement a different prior for tfp.layers?

My code for implementing this is as follows:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def _Prior_fn(mean, scale, *args, **kwargs):
    d = tfd.Normal(loc=mean, scale=scale)
    def fn(*args, **kwargs):
        return tfd.Independent(d, reinterpreted_batch_ndims=tf.size(d.batch_shape_tensor()))
    return fn

# num_example is the number of training examples; initialization holds the
# prior means/scales (see the note below).
tfp.layers.Convolution2DReparameterization(
    filters=6, kernel_size=(5, 5), padding='same', activation='relu', dtype='float64',
    kernel_divergence_fn=lambda q, p, _: tfp.distributions.kl_divergence(q, p) / num_example,
    kernel_prior_fn=_Prior_fn(mean=initialization[0], scale=initialization[1]),
    bias_posterior_fn=tfp.layers.default_mean_field_normal_fn(),
    bias_prior_fn=_Prior_fn(mean=initialization[2], scale=initialization[3]),
    bias_divergence_fn=lambda q, p, _: tfp.distributions.kl_divergence(q, p) / num_example)
```

**Note:** initialization[0] and initialization[1] have the exact shape of the kernel posterior, and initialization[2] and initialization[3] have the exact shape of the bias posterior.
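For example (an illustrative shape check; the channel count `C` and the zero/one values here are just placeholders):

```python
import tensorflow as tf

C = 3  # placeholder number of input channels
kernel_shape = (5, 5, C, 6)  # matches kernel_size=(5, 5), filters=6
bias_shape = (6,)            # one bias per filter

initialization = [
    tf.zeros(kernel_shape, dtype=tf.float64),  # kernel prior mean
    tf.ones(kernel_shape, dtype=tf.float64),   # kernel prior scale
    tf.zeros(bias_shape, dtype=tf.float64),    # bias prior mean
    tf.ones(bias_shape, dtype=tf.float64),     # bias prior scale
]
```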

Can you please verify this code, or share sample code showing how to implement a custom prior to pass to tfp.layers?

Thanks a lot for your help.

nbro commented 4 years ago

@MMorafah You don't explain what you are actually trying to do, apart from showing us a custom function that creates a prior and saying that you want to "change the prior". Maybe you should explain what your goal is. How do you want to change the prior?

saurabhdeshpande93 commented 3 years ago

@MMorafah, were you able to solve the problem? I am also interested in implementing custom priors for Convolution2DFlipout layers.

biophase commented 2 years ago

@MMorafah I think you need to check what the tfpl.Convolution2DReparameterization layer passes to your custom prior function. Here's a source you can use (check the arguments for make_normal_fn): https://www.tensorflow.org/probability/api_docs/python/tfp/layers/default_mean_field_normal_fn
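For reference, a minimal sketch of a parameterized prior whose inner callable matches the `(dtype, shape, name, trainable, add_variable_fn)` signature these layers pass in (the helper name `make_prior_fn` and the `tf.cast` are illustrative, not part of the TFP API):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def make_prior_fn(mean, scale):
    """Returns a prior fn closing over fixed mean/scale tensors."""
    def prior_fn(dtype, shape, name, trainable, add_variable_fn):
        # `shape` is the kernel/bias shape the layer passes in;
        # `mean` and `scale` must match (or broadcast to) it.
        d = tfd.Normal(loc=tf.cast(mean, dtype), scale=tf.cast(scale, dtype))
        return tfd.Independent(
            d, reinterpreted_batch_ndims=tf.size(d.batch_shape_tensor()))
    return prior_fn
```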

@saurabhdeshpande93 Here's a code snippet for tfpl.Convolution2DFlipout with custom priors (you might need to adjust the reinterpreted_batch_ndims to whatever number of kernel/bias dimensions your layer is expecting):

```python
import tensorflow as tf
import tensorflow_probability as tfp

def get_bias_prior(dtype, shape, name, trainable, add_variable_fn):
  # Fixed zero-mean Normal prior with scale 8 over the bias (event rank 1).
  prior = tfp.distributions.Independent(tfp.distributions.Normal(
                                      loc=tf.zeros(shape, dtype=dtype),
                                      scale=8.0 * tf.ones(shape, dtype=dtype)),
                                      reinterpreted_batch_ndims=1)
  return prior

def get_kernel_prior(dtype, shape, name, trainable, add_variable_fn):
  # Fixed zero-mean Normal prior with scale 4 over the 4-D kernel (event rank 4).
  prior = tfp.distributions.Independent(tfp.distributions.Normal(
                                      loc=tf.zeros(shape, dtype=dtype),
                                      scale=4.0 * tf.ones(shape, dtype=dtype)),
                                      reinterpreted_batch_ndims=4)
  return prior

tfp.layers.Convolution2DFlipout(
    filters=filters,
    kernel_size=kernel_size,
    strides=strides,
    padding=padding,
    activation=activation,

    kernel_posterior_fn=tfp.layers.util.default_mean_field_normal_fn(is_singular=False),
    kernel_posterior_tensor_fn=(lambda d: d.sample()),
    kernel_prior_fn=get_kernel_prior,
    kernel_divergence_fn=(lambda q, p, _: tfp.distributions.kl_divergence(q, p, allow_nan_stats=True)),

    bias_posterior_fn=tfp.layers.util.default_mean_field_normal_fn(is_singular=False),
    bias_posterior_tensor_fn=(lambda d: d.sample()),
    bias_prior_fn=get_bias_prior,
    bias_divergence_fn=(lambda q, p, _: tfp.distributions.kl_divergence(q, p, allow_nan_stats=True))
)
```

I suspect the same code should work for tfpl.Convolution2DReparameterization but I haven't tried it.
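As a quick usage check, here is a sketch (hypothetical input shape and filter count; it reuses `get_kernel_prior` / `get_bias_prior` from the snippet above). The KL terms produced by the divergence fns are registered via `add_loss`, so they appear in `model.losses` and Keras adds them to the training loss during `fit()`:

```python
import tensorflow as tf
import tensorflow_probability as tfp

# Hypothetical small model for MNIST-sized inputs.
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),
    tfp.layers.Convolution2DFlipout(
        filters=6, kernel_size=(5, 5), padding='same', activation='relu',
        kernel_prior_fn=get_kernel_prior,
        bias_prior_fn=get_bias_prior),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# One KL tensor per (kernel, bias) pair of each Bayesian layer.
print(model.losses)
```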