blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

Difference in evaluated and computed mean absolute error #930

Open g7dhaliwal opened 5 years ago

g7dhaliwal commented 5 years ago

Hi,

I am having a similar issue to the one posted in this Stack Overflow question: https://stackoverflow.com/questions/44931918/how-to-obtain-prediction-results-in-edward

Using `ed.evaluate('mean_absolute_error', data={X: X_test, y_post: y_test}, n_samples=100)`, I computed an MAE that is entirely different from the numpy-computed value. Below is the code for training a simple linear regression model (copied from one of the online tutorials). I extended this code by drawing posterior samples of the weights, using those samples to obtain an array of predictions, and then taking the mean of the predictions across the posterior samples, denoted `y_mean_test`.

The mean absolute error is then computed as the mean of `abs(y_mean_test - y_test)`. This error is different from the one returned by `ed.evaluate()`: Edward-computed MAE: 0.22587228; numpy-computed MAE: 3.171210873542781. Am I missing a fundamental concept here? I would really appreciate it if someone could help me understand the discrepancy.
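In symbols, the quantity I compute by hand is the MAE of the posterior-mean prediction, with `S = 100` sampled weight vectors:

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \bar{y}_i - y_{\text{test},i} \right|,
\qquad
\bar{y}_i = \frac{1}{S} \sum_{s=1}^{S} \left( x_i^\top w^{(s)} + b^{(s)} \right)
```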

```python
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal


def build_toy_dataset(N, w, noise_std=0.1):
    D = len(w)
    x = np.random.randn(N, D)
    y = np.dot(x, w) + np.random.normal(0, noise_std, size=N)
    return x, y


N = 40  # number of data points
D = 10  # number of features

w_true = np.random.randn(D)
X_train, y_train = build_toy_dataset(N, w_true)
X_test, y_test = build_toy_dataset(N, w_true)

# model: Bayesian linear regression with unit observation noise
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))

# mean-field variational approximation, fit with KLqp
qW = Normal(loc=tf.get_variable("qw/loc", [D]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [D])))
qB = Normal(loc=tf.get_variable("qB/loc", [1]),
            scale=tf.nn.softplus(tf.get_variable("qB/scale", [1])))
inference = ed.KLqp({w: qW, b: qB}, data={X: X_train, y: y_train})
inference.run(n_samples=5, n_iter=250)

# posterior predictive: copy of y with priors replaced by posteriors
y_post = ed.copy(y, {w: qW, b: qB})

print("MAE using edward: ")
print(ed.evaluate('mean_absolute_error',
                  data={X: X_test, y_post: y_test}, n_samples=100))
```

```
MAE using edward: 0.22587228
```
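For context on what `ed.evaluate` is scoring: as I understand it, `ed.copy(y, {w: qW, b: qB})` rebuilds the graph of `y` with the priors swapped for the fitted variational posteriors, so `y_post` is the posterior predictive

```latex
y_{\text{post}} \mid x \;\sim\; \mathcal{N}\!\left(x^\top w + b,\; 1\right),
\qquad w \sim q_W,\quad b \sim q_B,
```

and `n_samples=100` controls how many draws from it the evaluation uses.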

Computing MAE using numpy:

```python
n_samples = 100

# getting posterior samples
w_samples = w.sample(n_samples).eval().reshape(1, D, n_samples)
b_samples = b.sample(n_samples).eval().reshape(1, n_samples)
X_test = X_test.reshape(N, D, 1)

# obtaining predictions for each posterior sample point
yPred = np.sum(X_test * w_samples, axis=1) + b_samples

# obtaining the mean across posterior samples
y_mean_test = np.mean(yPred, axis=1).reshape(N, 1)
y_test = y_test.reshape(N, 1)
print("MAE using numpy: ", np.mean(np.abs(y_test - y_mean_test)))
```

```
MAE using numpy: 3.171210873542781
```
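One thing I was not sure about while writing the numpy check: `w.sample` and `b.sample` draw from the priors `w` and `b`, not from the fitted variational posteriors `qW` and `qB`. In case that is relevant, here is a minimal sketch of the variant I would try (same variables as in the script above; I use a plain matrix product instead of the reshape/broadcast):

```python
# undo the earlier reshape so X_test is (N, D) again
X_test_2d = X_test.reshape(N, D)

# draw from the variational posteriors instead of the priors
qw_samples = qW.sample(n_samples).eval()   # shape (n_samples, D)
qb_samples = qB.sample(n_samples).eval()   # shape (n_samples, 1)

# one prediction per posterior sample: shape (n_samples, N)
yPred_q = qw_samples.dot(X_test_2d.T) + qb_samples

# mean prediction across posterior samples, then MAE against y_test
y_mean_q = yPred_q.mean(axis=0)
print("MAE using qW/qB samples:", np.mean(np.abs(y_test.ravel() - y_mean_q)))
```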