I noticed in the linear regression notebook that the cost is only being calculated for the last sample in each epoch.
```python
with tf.Session() as sess:
    # Initialize the variables [w and b].
    sess.run(tf.global_variables_initializer())

    # Get the input tensors.
    X, Y = inputs()

    # Return the train loss and create the train_op.
    train_loss = loss(X, Y)
    train_op = train(train_loss)

    # Step 8: train the model
    for epoch_num in range(num_epochs):  # run 100 epochs
        for x, y in data:
            train_op = train(train_loss)
            # Session runs train_op to minimize loss
            loss_value, _ = sess.run([train_loss, train_op], feed_dict={X: x, Y: y})

        # Displaying the loss per epoch.
        print('epoch %d, loss=%f' % (epoch_num + 1, loss_value))

    # save the values of weight and bias
    wcoeff, bias = sess.run([W, b])
```
`data` is being iterated over, and the `loss_value` that is calculated is overwritten on each pass through the loop, so the reported loss reflects only the last sample. Since the loss needs to be computed over all of the data being used to train, the cost function should probably be something more like the following:
```python
def loss(X, Y):
    '''
    Compute the loss by comparing the predicted values to the actual labels.
    :param X: The inputs.
    :param Y: The labels.
    :return: The loss over all the samples.
    '''
    # Making the prediction.
    Y_predicted = inference(X)
    return tf.reduce_sum(tf.squared_difference(Y, Y_predicted)) / (2 * data.shape[0])
```
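As a sanity check, the proposed cost is the standard (halved) mean squared error. A small NumPy sketch with hypothetical toy data confirms that summing the squared differences and dividing by `2 * n` matches averaging the per-sample squared errors and halving:

```python
import numpy as np

# Hypothetical toy data: rows of (x, y) samples.
data = np.array([[1.0, 2.0], [2.0, 3.9], [3.0, 6.1]])

def loss_all(y, y_pred):
    # Proposed cost: sum of squared differences over all samples / (2n).
    return np.sum((y - y_pred) ** 2) / (2 * data.shape[0])

w, b = 2.0, 0.0  # hypothetical weight and bias
y_pred = w * data[:, 0] + b
total = loss_all(data[:, 1], y_pred)

# Equivalent formulation: mean of per-sample squared errors, halved.
per_sample = (data[:, 1] - y_pred) ** 2
assert np.isclose(total, per_sample.mean() / 2)
```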
With the change above, the training section could be changed to the following (with the inner loop over `data` removed completely):
```python
with tf.Session() as sess:
    # Initialize the variables [w and b].
    sess.run(tf.global_variables_initializer())

    # Get the input tensors.
    X, Y = inputs()

    # Return the train loss and create the train_op.
    # (Reuse train_loss here rather than calling loss(X, Y) again,
    # which would add a second, redundant loss node to the graph.)
    train_loss = loss(X, Y)
    train_op = train(train_loss)

    # Step 8: train the model
    for epoch_num in range(num_epochs):  # run 100 epochs
        loss_value, _ = sess.run([train_loss, train_op],
                                 feed_dict={X: data[:, 0], Y: data[:, 1]})

        # Displaying the loss per epoch.
        print('epoch %d, loss=%f' % (epoch_num + 1, loss_value))

    # save the values of weight and bias
    wcoeff, bias = sess.run([W, b])
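Alternatively, if the per-sample (SGD-style) inner loop is kept, the epoch loss can be accumulated and averaged instead of overwritten. A framework-agnostic sketch of that pattern, where the hypothetical `step` function stands in for the `sess.run([train_loss, train_op], ...)` call and simply measures the halved squared error against a fixed line so the example runs without TensorFlow:

```python
def step(x, y, w=2.0, b=0.0):
    # Hypothetical stand-in for one training step: returns the loss
    # for a single (x, y) pair against the line y = w*x + b.
    return (y - (w * x + b)) ** 2 / 2

data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1)]  # hypothetical toy samples
num_epochs = 3

for epoch_num in range(num_epochs):
    epoch_loss = 0.0
    for x, y in data:
        epoch_loss += step(x, y)   # accumulate rather than overwrite
    epoch_loss /= len(data)        # average over the epoch
    print('epoch %d, loss=%f' % (epoch_num + 1, epoch_loss))
```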
I would be glad to submit a pull request with these and other minor changes. Please let me know if I have misunderstood something.