tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Question: how to train neural network gaussian mixture model with tfp #281

Closed jmamath closed 5 years ago

jmamath commented 5 years ago

Hello, I am trying to write a simple Gaussian mixture model using tfp. Basically I want to implement this tutorial: http://blog.otoro.net/2015/11/24/mixture-density-networks-with-tensorflow/ with TensorFlow Probability, but it seems that the weights are not carried within tfp distributions. Here is my try:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_probability as tfp
from keras.models import Model
from keras.layers import Dense, Input
from keras import regularizers

tfd = tfp.distributions

# Here we try to find a mapping from y to x
## Initialize data
x_data = np.linspace(-10, 10, 100)
r_data = np.random.randn(100)
y_data = 7*np.sin(x_data*0.75) + x_data + r_data

x_data = x_data.reshape(-1, 1)
y_data = y_data.reshape(-1, 1)

plt.scatter(y_data, x_data, s=50, facecolors='none', edgecolors='r')

y = Input(shape=(1,))
x = Input(shape=(1,))

hidden_units = 20
k_mixt = 24

hidden = Dense(hidden_units, activation='tanh',
               kernel_regularizer=regularizers.l2(0.01))(y)
alpha = Dense(k_mixt, activation='softmax')(hidden)
mu = Dense(k_mixt, activation='linear')(hidden)
sigma = Dense(k_mixt, activation='exponential', name='sigma')(hidden)

gm = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=alpha),
    components_distribution=tfd.Normal(loc=mu, scale=sigma))

log_loss = -tf.reduce_sum(gm.log_prob(tf.reshape(x, (-1,))))
train_ops = tf.train.AdamOptimizer().minimize(log_loss)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
z = sess.run(train_ops, feed_dict={x: x_data, y: y_data})
sess.close()

How would you do it? I have been trying for a week and I don't see any way to make it work. Thanks

hellocybernetics commented 5 years ago

Hi, can you provide more information about the problem you're having?

I think you are getting an error. If so, you should try using the tf.keras module instead of standalone keras. Standalone keras uses a data type that differs from TensorFlow's, even if you use the TensorFlow backend, while tf.keras uses tf.Tensor.
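Concretely, that would mean swapping the imports at the top of your script for the tf.keras equivalents, something like:

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras import regularizers
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions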

jmamath commented 5 years ago

Hello, I tried to replace every keras call with its tf.keras equivalent, but it still does not work. Simply put, if I run the variable z above I get None, whereas if I run the log_loss variable I get a number. It is as if we cannot use automatic differentiation through a tfp distribution, am I right?
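Is there a simple way to check this? For instance (a minimal sketch, assuming the graph built above), would inspecting the gradients directly tell me whether they are defined?

# Sketch: if autodiff works through gm.log_prob, none of these gradients should be None.
grads = tf.gradients(log_loss, tf.trainable_variables())
print([g is not None for g in grads])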

Here is another example of the problem I am trying to solve, this time with edward1: http://edwardlib.org/tutorials/mixture-density-network

hellocybernetics commented 5 years ago

Sorry, I mostly use TensorFlow eager execution, and I don't understand TensorFlow graph ops very well, so I have a question. The op returned by optimizer.minimize() evaluates to None when you sess.run() it. Isn't that the correct behavior? I think what you want to get is log_loss, isn't it?
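For example (a small sketch of what I mean, reusing the names from your script):

result = sess.run(train_ops, feed_dict={x: x_data, y: y_data})
print(result)  # None: train_ops is an Operation, it has no output value
loss_value = sess.run(log_loss, feed_dict={x: x_data, y: y_data})
print(loss_value)  # a number: the current negative log-likelihood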

With the code below, I observed the loss decreasing.

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras import regularizers 
import tensorflow_probability as tfp

tfd = tfp.distributions
# tf.enable_eager_execution()

# Here we try to find a mapping from y to x
## Initialize data
x_data = np.linspace(-10,10,100)
r_data = np.random.randn(100)
y_data = 7*np.sin(x_data*0.75)+ x_data + r_data

x_data = x_data.reshape(-1,1)
y_data = y_data.reshape(-1,1)

plt.scatter(y_data, x_data, s=50, facecolors='none', edgecolors='r')

y = Input(shape=(1,))
x = Input(shape=(1,))

hidden_units = 20
k_mixt = 24

hidden = Dense(hidden_units, activation=tf.nn.tanh, 
               kernel_regularizer=regularizers.l2(0.01))(y)
alpha = Dense(k_mixt, activation=tf.nn.softmax)(hidden)
mu = Dense(k_mixt, activation=None)(hidden)
sigma = Dense(k_mixt, activation=tf.nn.softplus,name='sigma')(hidden)

gm = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=alpha),
    components_distribution=tfd.Normal(loc=mu, scale=sigma))

log_loss = -tf.reduce_sum(gm.log_prob(tf.reshape(x,(-1,)))) 
train_ops = tf.train.AdamOptimizer().minimize(log_loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10000):
        z, loss = sess.run([train_ops, log_loss], 
                            feed_dict={x:x_data,y:y_data})
        if (i+1) % 1000 == 0:
            print(str(i+1) + " : " + str(loss))

    pred_weight, pred_mean, pred_std = sess.run([alpha, mu, sigma], 
                                                feed_dict={y:y_data})       
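From there, if you want to reproduce the sampling step of the otoro tutorial, one rough numpy sketch (just one possible way, using the predicted parameters above) is to pick a mixture component per input point and draw from the corresponding Gaussian:

samples = np.zeros(len(y_data))
for i in range(len(y_data)):
    w = pred_weight[i] / pred_weight[i].sum()   # renormalize against float32 rounding
    k = np.random.choice(k_mixt, p=w)           # pick a mixture component
    samples[i] = np.random.randn() * pred_std[i, k] + pred_mean[i, k]
plt.scatter(y_data, samples, s=50, facecolors='none', edgecolors='b')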
jmamath commented 5 years ago

Well, thank you! It seems that sometimes it's good to have an outside look.