Open vandana-rajan opened 5 years ago
@vandana-rajan It seems your issue fell through the cracks. I hope you've solved your problem by now, but in case it helps others...
KLDivergenceRegularizer penalizes the KL divergence between two distributions, but an LSTM's output is a plain tensor, not a distribution. What you probably want is to feed the output of the LSTM layer into a tfpl.IndependentNormal layer, and to put the regularizer on that layer.
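To make the failure concrete, here is a minimal pure-Python sketch of the sampled KL term that shows up at the bottom of the traceback in the question (`distribution_a.log_prob(z) - distribution_b.log_prob(z)`). `StandardNormal` and `sampled_kl_term` are hypothetical stand-ins I made up for illustration, not TFP classes. The only point is that the first argument has to expose `log_prob`, which a raw array of activations does not.

```python
import math


class StandardNormal:
    """Hypothetical stand-in for a distribution object (not a TFP class)."""

    def log_prob(self, z):
        # Log-density of independent N(0, 1) components, summed over dimensions.
        return sum(-0.5 * x * x - 0.5 * math.log(2.0 * math.pi) for x in z)


def sampled_kl_term(dist_a, dist_b, z):
    # Same shape as the expression in the traceback: both arguments need log_prob.
    return dist_a.log_prob(z) - dist_b.log_prob(z)


activations = [0.1, -0.3, 0.7]  # plays the role of the raw LSTM output
prior = StandardNormal()

try:
    sampled_kl_term(activations, prior, z=activations)
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)
```

Passing two distribution-like objects works, while passing the raw activations fails in the same way as the reported error.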
Note that tfpl.IndependentNormal requires twice as many parameters as you pass to the tfd.MultivariateNormalDiag's loc kwarg, because it needs parameters for both the loc and the scale_diag.
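import tensorflow as tf
import tensorflow_probability as tfp
tfd, tfpl = tfp.distributions, tfp.layers
N_LATENT = 64
# `enc` and `LSTM` come from the question's code; the LSTM emits 2 * N_LATENT values (loc and scale parameters).
enc.add(LSTM(units=2 * N_LATENT, activation='tanh', return_sequences=False))
prior = tfd.MultivariateNormalDiag(loc=tf.zeros(N_LATENT))
regu = tfpl.KLDivergenceRegularizer(prior)
enc.add(tfpl.IndependentNormal(event_shape=(N_LATENT,), activity_regularizer=regu))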
I don't see why you couldn't use use_exact_kl=True in this example.
I haven't tested the above but it should look something like that. Note the 2*N_LATENT for the number of LSTM units. There are other details that you might want to look into, like shifting and softplus-transforming the second half of the parameters in the MultivariateNormalDiag dist, but that's outside the scope of this answer.
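For readers who want the pieces spelled out, the following is a minimal pure-Python sketch (no TensorFlow) of the two ideas mentioned here: reading 2*N_LATENT raw outputs as a loc plus a softplus-transformed scale, and computing the KL penalty against a standard normal prior either in closed form or as a sampled average of log_prob differences, which is the expression visible in the traceback. All function names are made up for illustration, and this is an assumption about the general technique rather than TFP's actual implementation.

```python
import math
import random


def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the scale strictly positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def split_loc_scale(params, n_latent):
    # First half is used as loc, second half becomes the scale after softplus.
    if len(params) != 2 * n_latent:
        raise ValueError("expected 2 * n_latent parameters")
    return list(params[:n_latent]), [softplus(p) for p in params[n_latent:]]


def log_prob_diag_normal(z, loc, scale):
    # Log-density of a diagonal Gaussian, summed over dimensions.
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for x, m, s in zip(z, loc, scale)
    )


def exact_kl_to_standard_normal(loc, scale):
    # Closed form of KL(N(loc, diag(scale^2)) || N(0, I)).
    return sum(0.5 * (s * s + m * m - 1.0) - math.log(s) for m, s in zip(loc, scale))


def sampled_kl_to_standard_normal(loc, scale, num_samples, seed=0):
    # Monte Carlo estimate: average of log q(z) - log p(z) with z drawn from q.
    rng = random.Random(seed)
    zeros, ones = [0.0] * len(loc), [1.0] * len(loc)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.gauss(m, s) for m, s in zip(loc, scale)]
        total += log_prob_diag_normal(z, loc, scale) - log_prob_diag_normal(z, zeros, ones)
    return total / num_samples


raw_outputs = [1.0, 0.0, 0.5, -0.5]  # 2 * n_latent values for n_latent = 2
loc, scale = split_loc_scale(raw_outputs, n_latent=2)
exact = exact_kl_to_standard_normal(loc, scale)
estimate = sampled_kl_to_standard_normal(loc, scale, num_samples=20000)
print(loc, scale, exact, estimate)
```

The sampled estimate is noisy but should land close to the closed-form value when enough samples are averaged; the exact form removes that noise when a closed form is available for the pair of distributions.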
I wanted to enforce a PDF on the intermediate layer of a DNN. The solution I got was from the link https://stackoverflow.com/questions/57920804/how-to-enforce-a-probability-distribution-on-intermediate-layer-features
I implemented it as follows:
enc.add(LSTM(units=64,activation='tanh',return_sequences=False,activity_regularizer=args.regularizer))
where args.regularizer = tfpl.KLDivergenceRegularizer(tfd.MultivariateNormalDiag(loc=tf.zeros(64)), use_exact_kl=False)
I am getting the following error:
WARNING: Logging before flag parsing goes to stderr.
W0915 14:05:54.929620 47844246007424 deprecation.py:506] From /data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Traceback (most recent call last):
File "cnn_lstm_tfk.py", line 90, in
model = emo1d(input_shape=x_tr.shape[1:],num_classes=len(np.unique(np.argmax(y_tr, 1))),args=args)
File "cnn_lstm_tfk.py", line 31, in emo1d
enc.add(LSTM(units=64,activation='tanh',return_sequences=False,activity_regularizer=args.regularizer))
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 192, in add
output_tensor = layer(self.outputs[0])
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/keras/layers/recurrent.py", line 619, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 664, in __call__
self._handle_activity_regularization(inputs, outputs)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1629, in _handle_activity_regularization
activity_loss = self._activity_regularizer(output)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1214, in __call__
return self._kl_divergence_fn(distribution_a)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1335, in _fn
kl = kl_divergence_fn(distribution_a, distribution_b_)
File "/data/scratch/eex608/Anaconda/envs/Py3/lib/python3.6/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1323, in kl_divergence_fn
input_tensor=distribution_a.log_prob(z) - distribution_b.log_prob(z),
AttributeError: 'Tensor' object has no attribute 'log_prob'
Package details:
tensorboard 1.14.0 py36hf484d3e_0 anaconda
tensorflow 1.14.0 gpu_py36h3fb9ad6_0 anaconda
tensorflow-base 1.14.0 gpu_py36he45bfe2_0 anaconda
tensorflow-estimator 1.14.0 py_0 anaconda
tensorflow-gpu 1.14.0 h0d30ee6_0 anaconda
tensorflow-probability 0.7.0
keras-applications 1.0.8 py_0 anaconda
keras-base 2.2.4 py36_0 anaconda
keras-gpu 2.2.4 0 anaconda
keras-preprocessing 1.1.0 py_1 anaconda
python 3.6.9 h265db76_0
Can anyone please let me know what this error is?