heiheihei-ops closed this issue 4 years ago
Hi again @heiheihei-ops, scaling the loss by a constant (for example, dividing by the batch size) should leave the training process largely unaffected, since we only care about the relative value of the loss and not its absolute value. Unfortunately, I do not have the loss curve handy :(
Hope this helps!
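To make the point about relative versus absolute loss values concrete, here is a minimal sketch in plain Python (not the code from this repository). It assumes plain gradient descent on a toy 1-D least-squares problem; the data, `grad_sum`, `grad_mean`, and `train` are all made up for illustration. Dividing the loss by the batch size N divides the gradient by N, which is equivalent to dividing the learning rate by N, so rescaling the learning rate recovers the same trajectory.

```python
# Toy data for a 1-D least-squares fit y ≈ w * x (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]


def grad_sum(w):
    """Gradient of sum_i (w * x_i - y_i)^2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def grad_mean(w):
    """Gradient of the mean-reduced loss: the summed gradient divided by N."""
    return grad_sum(w) / len(xs)


def train(grad_fn, lr, steps=50, w=0.0):
    """Plain gradient descent; returns the final weight."""
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w


n = len(xs)
w_sum = train(grad_sum, lr=0.01)        # sum-reduced loss
w_mean = train(grad_mean, lr=0.01 * n)  # mean-reduced loss, learning rate scaled by N
print(w_sum, w_mean)
```

Both runs should end at essentially the same weight. With a fixed learning rate and no rescaling, the two reductions would take steps of different sizes, so the choice can still interact with the learning rate in practice.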
Thank you for your answer. I noticed that you use .mean() in your code, though, and I think that is equivalent to dividing by the batch size. By the way, would it be possible to see the loss curve from your training? Thank you so much!
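A quick way to check the equivalence mentioned above, as a sketch in plain Python rather than the repository's code. It assumes the loss holds exactly one value per sample; the numbers in `per_sample_losses` are hypothetical.

```python
import statistics

# Hypothetical per-sample losses for one batch of size 4.
per_sample_losses = [0.9, 1.4, 0.3, 2.0]
batch_size = len(per_sample_losses)

# Mean reduction over the batch.
loss_mean = statistics.mean(per_sample_losses)

# Sum reduction followed by an explicit division by the batch size.
loss_sum_div = sum(per_sample_losses) / batch_size

print(loss_mean, loss_sum_div)
```

The two values agree up to floating-point rounding. One caveat: in PyTorch, calling `.mean()` with no arguments averages over every element of the tensor, so it matches dividing by the batch size only when the tensor has one entry per sample.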