Closed — irthomasthomas closed this issue 5 months ago
If you mean for calibration, I'd still use the builtin dataset for that as well. It does contain quite a lot of code, too.
As for testing, I would test on what's most relevant to how the model is going to be used. Code makes sense for a code model, as long as you're pretty sure it's a representative sample. It's definitely risky to test just on wikitext if you want the model to also handle code well. Etc.
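To make the testing advice concrete, here is a minimal sketch of what such a comparison boils down to: computing perplexity over a code sample versus a wikitext sample and seeing where the quantized model degrades. The per-token log-probabilities below are made-up illustrative numbers, not output from any real model.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over the evaluated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probs from a quantized model on two test sets.
code_logprobs = [-0.8, -1.1, -0.6, -0.9]  # tokens from a representative code sample
wiki_logprobs = [-2.1, -1.9, -2.4, -2.0]  # tokens from a wikitext sample

print(f"code ppl:     {perplexity(code_logprobs):.2f}")
print(f"wikitext ppl: {perplexity(wiki_logprobs):.2f}")
```

If the code-sample perplexity rises sharply after quantization while wikitext barely moves, that is exactly the regression a wikitext-only test would miss.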
I see, thank you for the help.
Sorry to open an issue for just a quick question, but: when quantizing code models, would it be better to use a code-oriented dataset to test against, assuming it's in the model?
Thanks!