axolotl-ai-cloud / axolotl

Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0

Fix untrained tokens #1771

Closed: winglian closed this 1 month ago

winglian commented 1 month ago

Mistral 12B has a number of untrained reserved tokens that are useful as ChatML tokens. This change lets us use them without the grad norm going to inf.
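For illustration only, here is a minimal sketch of one common way to handle untrained tokens: replace their embedding rows with the mean of the trained rows. This is not necessarily what this PR does. The function name `fix_untrained_rows`, the `eps` threshold, and the assumption that untrained rows are (near) zero are all hypothetical, and plain Python lists stand in for the model's embedding matrix.

```python
def fix_untrained_rows(embeddings, eps=1e-16):
    """Return a copy of `embeddings` with near-zero rows replaced by the mean trained row.

    Assumption (not confirmed by the PR): an "untrained" token is one whose
    embedding row is all (near) zero.
    """
    def is_untrained(row):
        # A row counts as untrained when every component is within eps of zero.
        return max(abs(x) for x in row) <= eps

    trained = [row for row in embeddings if not is_untrained(row)]
    if not trained:
        # Nothing to average over; return an unchanged copy.
        return [list(row) for row in embeddings]

    dim = len(trained[0])
    # Component-wise mean over the trained rows only.
    mean_row = [sum(row[j] for row in trained) / len(trained) for j in range(dim)]

    return [list(mean_row) if is_untrained(row) else list(row) for row in embeddings]


# Toy embedding table: two trained rows and one untrained (all-zero) row.
toy = [
    [1.0, 3.0],
    [3.0, 5.0],
    [0.0, 0.0],
]
fixed = fix_untrained_rows(toy)
```

In a real training setup the same idea would be applied to the model's input embedding tensor (and possibly the output head) before training starts, so that the reserved tokens begin from a reasonable point rather than from zeros.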

ehartford commented 1 month ago

Weird, I'm training with no inf/NaN, and I started before this went into main.