JinwoongKim opened this issue 6 years ago
For the loss, I found this code:

    loss = 0.0
    for i in xrange(num_train):
        # class scores for the i-th training example
        scores = X[i].dot(W)
        # subtract the max score for numerical stability (does not change the result)
        scores -= np.max(scores)
        sum_esj = np.sum(np.exp(scores))
        # cross-entropy: minus the log of the softmax probability of the correct class
        loss += -np.log(np.exp(scores[y[i]]) / sum_esj)
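As an editorial note (not part of the original post), this loop appears to compute the standard per-example softmax cross-entropy loss, where s_j is the j-th entry of X[i].dot(W); subtracting the max is purely for numerical stability:

```latex
% s_j : score of class j for example i (the j-th entry of X[i].dot(W))
L_i = -\log\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}
    = -\log\frac{e^{s_{y_i} - \max_k s_k}}{\sum_j e^{s_j - \max_k s_k}}
```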
For the gradient (used in gradient descent), I found this code:
    loss = 0.0
    for i in xrange(num_train):
        scores = X[i].dot(W)
        scores -= np.max(scores)
        sum_esj = np.sum(np.exp(scores))
        loss += -np.log(np.exp(scores[y[i]]) / sum_esj)
        for j in xrange(num_classes):
            if j == y[i]:
                # update for the correct-class column of W
                dW[:, y[i]] += -(sum_esj - np.exp(scores[y[i]])) / sum_esj * X[i]
            else:
                # update for every other class column of W
                dW[:, j] += np.exp(scores[j]) / sum_esj * X[i]
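A minimal, self-contained way to convince yourself that this loop computes the right gradient is a numerical gradient check. The sketch below (my addition, not from the original post) re-implements the same math in a hypothetical helper, softmax_loss_and_grad, and compares it against finite differences on made-up toy data:

```python
import numpy as np

def softmax_loss_and_grad(W, X, y):
    """Average softmax cross-entropy loss and gradient (same math as the loop above)."""
    num_train = X.shape[0]
    loss, dW = 0.0, np.zeros_like(W)
    for i in range(num_train):
        scores = X[i].dot(W)
        scores -= np.max(scores)                       # numerical stability
        probs = np.exp(scores) / np.sum(np.exp(scores))
        loss += -np.log(probs[y[i]])
        probs[y[i]] -= 1.0                             # p_j - 1[j == y[i]]
        dW += np.outer(X[i], probs)                    # adds (p_j - 1[j == y[i]]) * X[i] to column j
    return loss / num_train, dW / num_train

# Toy data (made up) to compare the analytic gradient with a numerical one.
np.random.seed(0)
X = np.random.randn(5, 4)
y = np.array([0, 2, 1, 2, 0])
W = 0.01 * np.random.randn(4, 3)

loss, dW = softmax_loss_and_grad(W, X, y)
h = 1e-5
num_dW = np.zeros_like(W)
for idx in np.ndindex(*W.shape):
    Wp, Wm = W.copy(), W.copy()
    Wp[idx] += h
    Wm[idx] -= h
    num_dW[idx] = (softmax_loss_and_grad(Wp, X, y)[0] - softmax_loss_and_grad(Wm, X, y)[0]) / (2 * h)

print("max abs difference:", np.abs(dW - num_dW).max())  # should be ~1e-8 or smaller
```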
But I couldn't find any equation or note that explains where the two dW branches in softmax_loss_naive come from.
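For what it's worth, the two branches match the standard gradient of the softmax cross-entropy loss. A sketch of the derivation (my own note, not from the assignment handout), writing p_j = e^{s_j} / Σ_k e^{s_k} for the softmax probability of class j on example i:

```latex
% Per-example loss: L_i = -\log p_{y_i},  with  p_j = e^{s_j} / \sum_k e^{s_k}
\frac{\partial L_i}{\partial s_j} = p_j - \mathbf{1}[j = y_i]
% Chain rule through s_j = X[i] \cdot W_{:,j} gives the column updates:
\frac{\partial L_i}{\partial W_{:,y_i}} = (p_{y_i} - 1)\,X[i]
  = -\,\frac{\sum_k e^{s_k} - e^{s_{y_i}}}{\sum_k e^{s_k}}\,X[i],
\qquad
\frac{\partial L_i}{\partial W_{:,j}} = p_j\,X[i] \quad (j \neq y_i)
```

These two expressions are exactly what the if/else branches of the dW update in the loop implement.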