Open RenzhiHu111 opened 1 month ago
Hello, when training on the LOLv2-syn dataset, the loss keeps becoming NaN. Adding a small constant to the sqrt doesn't solve the issue, and switching from AdamW to Lion doesn't reduce the loss either.
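
For anyone reproducing this, here is a minimal plain-Python sketch of why a sqrt can produce NaN and what the epsilon does and does not cover. It is not the repository's code: `sqrt_grad`, `clamped_sqrt_grad`, and the `1e-8` epsilon are hypothetical names and values chosen for illustration. The point is that an epsilon only helps when the sqrt input is exactly zero; if the input can go slightly negative, the result is still NaN unless the input is clamped first.

```python
import math

def sqrt_grad(x: float, eps: float = 0.0) -> float:
    """Derivative of sqrt(x + eps) with respect to x (hypothetical helper)."""
    shifted = x + eps
    if shifted < 0.0:
        return math.nan   # sqrt of a negative number is undefined
    if shifted == 0.0:
        return math.inf   # 1 / (2 * sqrt(0)) diverges
    return 0.5 / math.sqrt(shifted)

def clamped_sqrt_grad(x: float, eps: float = 1e-8) -> float:
    """Derivative of sqrt(max(x, 0) + eps) with respect to x."""
    if x < 0.0:
        return 0.0        # the clamp blocks the gradient for negative inputs
    return sqrt_grad(x, eps)

# Chain rule with a zero upstream gradient: 0 * inf is NaN without eps.
no_eps = 0.0 * sqrt_grad(0.0)
with_eps = 0.0 * sqrt_grad(0.0, eps=1e-8)

# A slightly negative input (e.g. from rounding) defeats a small eps.
negative_input = sqrt_grad(-1e-3, eps=1e-8)
negative_clamped = clamped_sqrt_grad(-1e-3)

print(no_eps, with_eps, negative_input, negative_clamped)
```

If the NaN persists with the epsilon in place, this suggests checking whether the sqrt input can be negative, or whether the NaN originates in a different operation altogether.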