blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

Why rbf() matrix is not positive semi-definite for D <= 2? #789

Open russellizadi opened 7 years ago

russellizadi commented 7 years ago

I have a problem with the rbf() function output. The code below fails for D=2 and D=1, even though the output of rbf() is supposed to be positive semi-definite (PSD).

import tensorflow as tf
import edward as ed
from edward.util import rbf

sess = tf.Session()
X = tf.random_normal([100, 5])
K = ed.rbf(X)        # RBF Gram matrix, should be PSD
L = tf.cholesky(K)
print(sess.run(L))   # works for D = 5

X = tf.random_normal([100, 2])
K = ed.rbf(X)
L = tf.cholesky(K)
print(sess.run(L))   # fails for D = 2 with a Cholesky error
russellizadi commented 6 years ago

Now I'm sure there is an issue with the ed.rbf() function, because the radial basis function of a random matrix can be computed with scipy:

import tensorflow as tf
import scipy
from scipy.spatial.distance import pdist, squareform

X = tf.random_normal([50, 2])
sess = tf.Session()
X = sess.run(X)
pairwise_dists = squareform(pdist(X, 'euclidean'))
A = scipy.exp(-pairwise_dists ** 2)  # rbf()
L = tf.cholesky(A)
sess.run(L)

I'm not sure whether I can fix the function with the help of these scipy functions, or whether I need to write pdist and squareform equivalents in TensorFlow. Then I can open a pull request.
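For what it's worth, pdist/squareform wouldn't need to be reimplemented op by op: the squared pairwise distances follow from the identity ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i·x_j, which maps directly onto reduce_sum/matmul-style tensor ops. A numpy sketch of the idea, using the thread's exp(-d^2) kernel convention (the TF version would swap in tf.reduce_sum and tf.matmul):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.random.randn(50, 2)

# ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j, with matrix ops only
sq_norms = np.sum(X ** 2, axis=1)                           # shape (N,)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
sq_dists = np.maximum(sq_dists, 0.0)                        # guard against tiny negatives

K = np.exp(-sq_dists)                                       # same kernel as the scipy snippet

# matches the pdist/squareform construction
K_scipy = np.exp(-squareform(pdist(X, 'euclidean')) ** 2)
print(np.max(np.abs(K - K_scipy)))
```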

dustinvtran commented 6 years ago

rbf is written as Tensor-in, Tensor-out, so scipy isn't recommended unless you're willing to fetch the input out of the graph as you do here. It would be great to look into scipy's internals to see how it overcomes the PSD issue.

YoshikawaMasashi commented 6 years ago

I think this is not a problem with rbf specifically.

For example, this code works in most cases.

import tensorflow as tf
import edward as ed

sess = tf.Session()
X = tf.random_normal([10, 2])
K = ed.rbf(X)
L = tf.cholesky(K)
print(sess.run(L))

On the other hand, this code fails in most cases.

import tensorflow as tf
import scipy
from scipy.spatial.distance import pdist, squareform

X = tf.random_normal([500, 2])
sess = tf.Session()
X = sess.run(X)
pairwise_dists = squareform(pdist(X, 'euclidean'))
A = scipy.exp(-pairwise_dists ** 2) #rbf()
L = tf.cholesky(A)
sess.run(L)

Thus, when N is large and D is small, the Gram matrix built from the rbf kernel sometimes cannot be Cholesky-decomposed. This is caused by numerical error: the matrix is PSD in exact arithmetic, but floating-point round-off pushes its smallest eigenvalues slightly below zero.

To check, I performed an eigenvalue decomposition of K.

X = tf.random_normal([100, 2])
K = ed.rbf(X)
e,v = tf.linalg.eigh(K)
print(sess.run(e))

Some eigenvalues are slightly negative. The same error occurs even if I do the calculation with numpy.
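To confirm this is floating-point behavior rather than anything Edward-specific, here is a pure numpy reproduction under the same exp(-d^2) kernel convention; the exact smallest eigenvalue depends on the random draw, but for large N and small D it typically lands at or slightly below zero:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(500, 2)                 # large N, small D

# RBF Gram matrix, PSD in exact arithmetic by construction
sq_norms = np.sum(X ** 2, axis=1)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
K = np.exp(-sq_dists)

e = np.linalg.eigvalsh(K)
print(e.min())                        # typically slightly negative
```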

[-2.02615001e-06 -1.17295554e-06 -1.09173573e-06 -4.21304691e-07
 -3.66310388e-07 -2.76543318e-07 -2.41209563e-07 -2.33321103e-07
 -2.17294570e-07 -1.85838630e-07 -1.43979818e-07 -1.06772191e-07
 -9.18969718e-08 -8.08987153e-08 -7.71226212e-08 -6.76115590e-08
 -2.95344567e-08 -5.75027270e-09  7.81395215e-09  1.31351667e-08
  3.17538245e-08  5.23058823e-08  9.23645445e-08  1.09228104e-07
  1.36151129e-07  1.61092430e-07  1.98965125e-07  2.19802516e-07
  2.57777657e-07  2.84293236e-07  3.23920830e-07  3.53408836e-07
  3.66509113e-07  4.51453076e-07  5.00539159e-07  5.33806144e-07
  6.22740515e-07  7.31816442e-07  1.10045585e-06  1.20567427e-06
  1.65554218e-06  1.77853553e-06  2.50796461e-06  4.34908679e-06
  5.70953489e-06  8.38064352e-06  1.05908493e-05  1.19142778e-05
  1.79253493e-05  2.79868564e-05  3.43420725e-05  5.34625979e-05
  6.03530061e-05  1.01715254e-04  1.52667140e-04  2.44186842e-04
  3.21619795e-04  3.56368662e-04  4.00404155e-04  5.42404305e-04
  1.01348944e-03  1.31238438e-03  1.67004613e-03  1.91607641e-03
  3.39790666e-03  4.56094369e-03  5.08843083e-03  6.47558365e-03
  8.21951590e-03  1.12576941e-02  1.33762928e-02  1.62532404e-02
  3.91801372e-02  4.15632613e-02  4.65382561e-02  6.06274828e-02
  7.13901967e-02  9.48066786e-02  1.55950129e-01  1.77429289e-01
  2.40317121e-01  2.81880975e-01  3.79758894e-01  4.89038706e-01
  5.75174570e-01  7.13687718e-01  7.34729767e-01  8.74172091e-01
  1.12785423e+00  1.22224009e+00  1.39746940e+00  1.85125256e+00
  2.32930803e+00  2.71350718e+00  4.16019821e+00  6.09627485e+00
  7.07438231e+00  1.34414482e+01  1.57805233e+01  3.77524490e+01]

The solution is to clamp the eigenvalues to small positive values and reconstruct K.

X = tf.random_normal([100, 2])
K = ed.rbf(X)

e, v = tf.linalg.eigh(K)
eps = 1e-5
e = tf.maximum(e, eps)  # clamp eigenvalues from below
K = tf.matmul(tf.matmul(v, tf.diag(e)), tf.transpose(v))  # reconstruct K

L = tf.cholesky(K)
print(sess.run(L))
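An alternative that avoids the eigendecomposition entirely is the common "jitter" trick from GP libraries: add a small constant to the diagonal before factorizing, which shifts every eigenvalue up by eps. A numpy sketch of the idea (in the TF code above this would correspond to something like K + eps * tf.eye(N)):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(500, 2)

# same exp(-d^2) Gram matrix as above
sq_norms = np.sum(X ** 2, axis=1)
sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
K = np.exp(-sq_dists)

eps = 1e-6
K_jitter = K + eps * np.eye(K.shape[0])  # shifts all eigenvalues up by eps

L = np.linalg.cholesky(K_jitter)         # now succeeds
print(np.max(np.abs(L @ L.T - K_jitter)))
```

The jitter changes K only at the 1e-6 level, which is usually negligible next to the kernel's own hyperparameter uncertainty, and it is much cheaper than an eigendecomposition (O(N^3) with a far smaller constant).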