about the cross entropy

sgxzz1 commented 10 months ago

Thanks a lot for your work.

Where you call "nll_loss()" from "torch.nn.functional", I found that you apply only softmax() and not log(). I don't understand why the log is omitted: to calculate the cross entropy, the correct steps are softmax, then log, then nll_loss(). As written, this is an nll_loss without the log.
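To make the point concrete, here is a minimal sketch in plain Python (no PyTorch needed) of what the two orderings compute for a single sample. The helpers `softmax` and `nll` are hypothetical stand-ins written for this illustration, and the logits are made up. `nll` mirrors what `nll_loss()` does for one sample: it returns the negated input at the target index, so it expects log-probabilities. In PyTorch the correct path corresponds to `F.nll_loss(F.log_softmax(logits, dim=-1), target)`, or equivalently `F.cross_entropy(logits, target)`.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nll(scores, target):
    """Per-sample stand-in for nll_loss: negate the score at the target index."""
    return -scores[target]

# Made-up logits for a 3-class problem; class 0 is the target.
logits = [2.0, 0.5, -1.0]
target = 0

probs = softmax(logits)
log_probs = [math.log(p) for p in probs]

correct_ce = nll(log_probs, target)  # softmax -> log -> nll: true cross entropy
missing_log = nll(probs, target)     # softmax -> nll: the log step is skipped

print("cross entropy:      ", correct_ce)
print("nll on softmax only:", missing_log)
```

With the log, the loss is non-negative and grows without bound as the target probability goes to zero. Without it, the value is just the negated target probability, so it always lies in [-1, 0].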

dnguyengithub commented 10 months ago

Hello,

You're right. It's an error. That explains why the multi-resolution version of L_CE was not as good as expected. (We wrote in the paper: "We empirically observed that the prediction could be marginally improved if we use a multi-resolution version of L_CE"). So, in conclusion:

The basic L_CE was implemented correctly.
There was an error in the implementation of the multi-resolution version (when blur == True).

You can achieve a similar performance to what is reported in the paper by using the basic L_CE.

I'll fix the multi-resolution version and rerun the rest asap.
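For reference, the effect of dropping the log can be written out with standard softmax algebra. This is a general derivation, not an analysis of the repository's code; the symbols z (logits), p = softmax(z), and y (target index) are notation introduced here.

$$
L_{CE} = -\log p_y, \qquad \frac{\partial L_{CE}}{\partial z_k} = p_k - \mathbb{1}[k = y]
$$

$$
\tilde{L} = -p_y, \qquad \frac{\partial \tilde{L}}{\partial z_k} = p_y \left( p_k - \mathbb{1}[k = y] \right)
$$

The gradient without the log points in the same direction as the cross-entropy gradient but is scaled by p_y, so it shrinks exactly when the target is predicted with low probability. That would be consistent with a loss that still trains but helps less than expected.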

sgxzz1 commented 10 months ago

Thank you for your reply. This has helped me a lot; I had been bothered by this issue for a long time.

CIA-Oceanix / TrAISformer

about the cross entropy #26