Open 200N opened 4 months ago
Just stop the training and then resume from the last checkpoint. I have mentioned this in the README.
It's a bug that I have not solved yet. It most likely comes from the distributed training code.
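For anyone automating the stop-and-resume workaround, here is a rough sketch of picking the last checkpoint to resume from. The file naming scheme (`checkpoint_<step>.pt`) and the helper name `latest_checkpoint` are assumptions for illustration only and are not taken from this repo, so adjust them to whatever the training script actually writes.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme: checkpoint_<step>.pt (not confirmed by the repo).
_CKPT_PATTERN = re.compile(r"checkpoint_(\d+)\.pt$")


def latest_checkpoint(ckpt_dir: str) -> Optional[Path]:
    """Return the checkpoint file with the highest step number, or None if none exist."""
    best_step, best_path = -1, None
    for path in Path(ckpt_dir).glob("checkpoint_*.pt"):
        match = _CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_step:
            # Compare steps numerically so that step 2000 beats step 900.
            best_step, best_path = int(match.group(1)), path
    return best_path
```

The returned path can then be passed to whatever resume option the training script provides. The steps are compared as integers rather than as strings, because a plain alphabetical sort would rank `checkpoint_900.pt` after `checkpoint_2000.pt`.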
Hello, has this bug been fixed yet? I run into the same problem during training.