Dear author,
Thanks for your great work. I have been reproducing the code and found a small issue that I'm not sure about. After finishing stage1 training and starting stage2 training, I found that the loss becomes negative in some cases. After checking the loss code, I believe it should theoretically never be negative. Do you know what the problem might be, or is this not actually a problem and I am just misunderstanding something?