Closed: geoffreyangus closed this 9 months ago
4 files ±0 · 4 suites ±0 · 9m 29s :stopwatch: −17m 37s
12 tests −2 972 · 9 :heavy_check_mark: −2 962 · 3 :zzz: −9 · 0 :x: −1
40 runs −2 960 · 28 :heavy_check_mark: −2 953 · 12 :zzz: −6 · 0 :x: −1
Results for commit eaac1e41. ± Comparison against base commit d3470635.
:recycle: This comment has been updated with latest results.
Prior to this change, we appended the pad token to the end of the target tensor. This worked because many recent LLMs are trained with the pad token equal to the eos token. Gemma, however, uses a separate eos token, so during fine-tuning the model never learns to emit eos and generation never stops. We now append the eos token during fine-tuning so that LLMs are guaranteed to learn how to stop during the generation step.
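A minimal sketch of the idea, assuming a Hugging Face tokenizer (the model name and the `build_target_ids` helper are hypothetical, not this PR's actual code): terminate the target sequence with `eos_token_id` instead of `pad_token_id`, then pad the remainder.

```python
import torch
from transformers import AutoTokenizer

# Hypothetical model choice; Gemma uses distinct <eos> and <pad> tokens.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

def build_target_ids(text: str, max_length: int) -> torch.Tensor:
    """Tokenize a target string, terminate it with EOS, and pad to max_length."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    # Append EOS so the model learns to emit it and generation can stop.
    ids = ids[: max_length - 1] + [tokenizer.eos_token_id]
    # Fill any remaining positions with PAD (typically masked out of the loss).
    ids = ids + [tokenizer.pad_token_id] * (max_length - len(ids))
    return torch.tensor(ids)
```

With pad == eos (as in many earlier LLMs), padding alone incidentally taught the model to stop; once the two tokens differ, the eos token has to be added explicitly.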