Closed by marksverdhei 2 years ago
The paper reports results for the 1251000 checkpoints (250k steps on top of the 1M pre-training steps). I included another checkpoint (1,363k) in case anyone wanted to try a different number of steps. In an ideal world, we would have released them all. Hope this clarifies the issue.
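For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes that the trailing number in a checkpoint name is the total step count and that "1M pre-training" means exactly 1,000,000 steps; neither is confirmed beyond the comment above. The helper `finetune_steps` is hypothetical and not part of any released code.

```python
import re

# Assumption: "1M pre-training" means exactly 1,000,000 steps.
PRETRAIN_STEPS = 1_000_000


def finetune_steps(checkpoint_name: str) -> int:
    """Return the steps beyond pre-training implied by a checkpoint name.

    Assumes the name ends in "-<total_steps>", as in the names quoted
    in this thread.
    """
    match = re.search(r"-(\d+)$", checkpoint_name)
    if match is None:
        raise ValueError(f"no trailing step count in {checkpoint_name!r}")
    return int(match.group(1)) - PRETRAIN_STEPS


for name in ("UnifiedQA-v2-t5-base-1251000", "UnifiedQA-v2-t5-base-1363200"):
    print(name, finetune_steps(name))
```

Under these assumptions the 1251000 checkpoint corresponds to about 250k additional steps (251,000 exactly), which matches the number mentioned for v2, and the 1363200 checkpoint to 363,200.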
Hi, what is the reason that you landed on a single checkpoint for each model size for UnifiedQA while providing two different checkpoints for each of the sizes (e.g. UnifiedQA-v2-t5-base-1251000 vs 1363200) on the HF model hub? Are any of these the models described in the paper? There was mention of using 250k-step checkpoints for v2, but if the checkpoint name reflects the number of steps, then perhaps these two checkpoints are not it?