I downloaded the checkpoint that you released (https://drive.google.com/drive/folders/17GHwKRZbQfyC9-7oEpzCG8pp_rAI0cOm). When I run trainer.py on the full dataset, the checkpoint's loss is about 1e02, not 1e04. Is this the real checkpoint after 214501 iterations? Maybe there is something wrong with my code...