Open ruirui88 opened 5 years ago
Hi, you wrote in the paper that your loss function uses the Kullback-Leibler (KL) divergence for the classification loss. But you implement the cross-entropy function, not KL, in your source code. So which one is it?
You are right, but the KL divergence and the cross-entropy loss differ only by the entropy of the target distribution, KL(p || q) = CE(p, q) - H(p), and H(p) does not depend on the model parameters. Their gradients are therefore identical, so training and performance are not affected.
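This is easy to verify numerically. Below is a minimal NumPy sketch (not the repo's code, and the soft-label target `p` is made up for illustration) that computes both losses for a softmax classifier and checks, via finite differences, that their gradients with respect to the logits coincide:

```python
import numpy as np

eps = 1e-12
p = np.array([0.1, 0.7, 0.2])   # hypothetical soft-label target distribution
z = np.array([0.2, 0.5, 0.3])   # logits

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def cross_entropy(z):
    # CE(p, q) = -sum_i p_i * log(q_i)
    return -np.sum(p * np.log(softmax(z) + eps))

def kl_div(z):
    # KL(p || q) = sum_i p_i * log(p_i / q_i)
    return np.sum(p * np.log((p + eps) / (softmax(z) + eps)))

# KL = CE - H(p); H(p) is constant w.r.t. the logits z
entropy_p = -np.sum(p * np.log(p + eps))
assert np.isclose(kl_div(z), cross_entropy(z) - entropy_p)

# Central finite differences: both losses yield the same gradient,
# which equals the well-known softmax gradient (q - p).
h = 1e-6
g_ce = np.array([(cross_entropy(z + h * e) - cross_entropy(z - h * e)) / (2 * h)
                 for e in np.eye(3)])
g_kl = np.array([(kl_div(z + h * e) - kl_div(z - h * e)) / (2 * h)
                 for e in np.eye(3)])
assert np.allclose(g_ce, g_kl, atol=1e-6)
assert np.allclose(g_ce, softmax(z) - p, atol=1e-5)
```

Since the entropy term vanishes under differentiation, an implementation can substitute cross-entropy for KL divergence without changing the optimization at all; the loss values differ by the constant H(p), but the gradient updates are identical.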