Closed eyuansu62 closed 1 year ago
Only GSM8K fine-tuning.
Have you planned to verify your findings on more LLMs, such as code LLMs? I suspect code LLMs may point to a different conclusion.
Not really. For now we will focus on the 65B and 70B LLaMA models to verify more scaling-related conclusions.
Does SFT include an instruction-tuning process?