suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
MIT License

Issue in loss in pytorch version? #282

Closed cestpasphoto closed 1 year ago

cestpasphoto commented 1 year ago

When debugging my fork, I noticed something odd in the pytorch versions: the NN output is F.log_softmax(pi, dim=1), and the predict() function therefore returns torch.exp() of that output. But the train() function computes the loss from the raw NN output, without exp(). Isn't that an issue? For instance in othello, in https://github.com/suragnair/alpha-zero-general/blob/master/othello/pytorch/NNet.py, refer to lines 94 and 64.

I have compared with the tensorflow version: the NN uses a regular softmax, not log_softmax, so there is no missing exp() issue there. Maybe we could do the same in pytorch? I could provide a pull request if needed.
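To make the first half of the observation concrete: exponentiating a log_softmax output recovers exactly the softmax probabilities, which is what predict() relies on. A minimal sketch in plain Python (no torch dependency; the names and values are illustrative, not taken from NNet.py):

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: x_i - (m + log(sum_j exp(x_j - m)))
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

# Hypothetical raw network outputs for a 3-move position
logits = [2.0, 0.5, -1.0]

# predict()-style: exponentiate the log_softmax output to get probabilities
probs = [math.exp(v) for v in log_softmax(logits)]

# These are exactly the softmax probabilities, so they sum to 1
assert abs(sum(probs) - 1.0) < 1e-12
```

So the exp() in predict() is just undoing the log, not correcting an error in the forward pass.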

cestpasphoto commented 1 year ago

After some code review, this is a bit embarrassing for me: I found out there is no issue :smile: Using "log_softmax" optimizes the computation of the categorical cross-entropy loss, since the loss is just the negative dot product of the target policy with the log-probabilities. That is why there is no call to exp() during training.
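The equivalence can be checked numerically: computing the loss directly from the log_softmax output gives the same value as cross-entropy over the softmax probabilities. A small sketch in plain Python (illustrative values, not the actual NNet.py tensors):

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

# Hypothetical target policy (e.g. normalized MCTS visit counts) and raw logits
target = [0.7, 0.2, 0.1]
logits = [2.0, 0.5, -1.0]

# Loss computed from the log_softmax output directly, as in train()
loss_from_log = -sum(t * lp for t, lp in zip(target, log_softmax(logits)))

# Same loss written as cross-entropy over softmax probabilities
probs = [math.exp(v) for v in log_softmax(logits)]
loss_from_probs = -sum(t * math.log(p) for t, p in zip(target, probs))

assert abs(loss_from_log - loss_from_probs) < 1e-12
```

Besides skipping the redundant exp()/log() round trip, working in log space avoids the numerical instability of log(softmax(x)) for very small probabilities.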

suragnair commented 1 year ago

Thanks a lot for looking into the code in detail! It certainly helps when people analyse it independently.