We evaluated two Stable LM 2 1.6B checkpoints, the one saved before cooldown and the final one, and found a large gap in downstream task results. The final checkpoint scores higher on every task, with the largest gains on MMLU (30 to 39.8) and GSM (7.35 to 17.29). Were any special strategies used during the cooldown phase?
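| Checkpoint | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA | Winogrande (5-shot) | GSM (5-shot) |
| --- | --- | --- | --- | --- | --- | --- |
| Final | 43.52 | 70.3 | 39.8 | 36.61 | 64.17 | 17.29 |
| Before cooldown | 38.4 | 67.59 | 30 | 34.9 | 61.96 | 7.35 |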