k920049 closed this 8 months ago
After carefully examining the cross-entropy loss, I realized that the function does not in fact ignore the off-diagonal components. Thanks for the suggestion and for carefully testing the results.
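For anyone reading later, here is a minimal sketch of why the off-diagonal entries matter. It assumes a square logits matrix where row i's target class is column i (a common contrastive setup, which may not match the code in this thread exactly), and the helper name `cross_entropy_diagonal_targets` is hypothetical. The off-diagonal logits enter through the log-sum-exp normalizer, so changing them changes the loss even when the diagonal is untouched.

```python
import math

def cross_entropy_diagonal_targets(logits):
    """Mean cross-entropy where row i's target class is column i (assumed setup)."""
    total = 0.0
    for i, row in enumerate(logits):
        # Numerically stable log-sum-exp over the whole row,
        # which includes every off-diagonal logit.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        # Per-row loss: -log softmax(row)[i] = logsumexp(row) - row[i]
        total += log_sum_exp - row[i]
    return total / len(logits)

# Same diagonal in both matrices; only the off-diagonal logits differ.
base = [[2.0, 0.0], [0.0, 2.0]]
shifted = [[2.0, 1.5], [1.5, 2.0]]

loss_base = cross_entropy_diagonal_targets(base)
loss_shifted = cross_entropy_diagonal_targets(shifted)
print(loss_base, loss_shifted)
```

Raising the off-diagonal logits while keeping the diagonal fixed increases the loss, which is the behavior described above.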
Describe your changes