nfmcclure / tensorflow_cookbook

Code for Tensorflow Machine Learning Cookbook
https://www.packtpub.com/big-data-and-business-intelligence/tensorflow-machine-learning-cookbook-second-edition
MIT License
6.24k stars 2.41k forks source link

Bug in Non-Linear SVM Kernel Calculation #103

Closed fbcotter closed 6 years ago

fbcotter commented 7 years ago

I think there's a bug in the calculation of the Gaussian Kernel for the multiclass SVM. Your code is as follows:

# Gaussian (RBF) kernel
gamma = tf.constant(-10.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1,1])
sq_dists = tf.multiply(2., tf.matmul(x_data, tf.transpose(x_data)))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))

Initially it bothered me that the variable dist was calculated but never used. I then tried to figure out what you were doing in this code, and I think it is supposed to be doing something like this (dist is K in the below example)

image

In which case, the code should read something more like:

# Gaussian (RBF) kernel
gamma = tf.constant(-10.0)
K = tf.reduce_sum(tf.square(x_data), 1)
K_cols = tf.expand_dims(K, 1)
K_rows = tf.expand_dims(K, 0)

D = K_cols + K_rows - 2*tf.matmul(x_data, tf.transpose(x_data))
my_kernel = tf.exp(tf.multiply(gamma, D))

(The sq_dists calculation uses broadcasting so we don't need to create the J matrix as in the math above)

nfmcclure commented 6 years ago

HI @fbcotter , Thanks for this! I'm just getting around to triaging issues in preparation for a v2-update of the code.

I suspect you are correct. When I get around to chapter 4, I'll investigate and most likely accept your proposal. Expect more in the next month or so. Thanks again!

nfmcclure commented 6 years ago

Yes, thanks. I just looked into that and found you are correct. I've implemented a very similar fix and kept the 'dist' variable terminology in both the *.py and the jupyter notebook. Thanks again!