the dataset selection sft on OpenAssisant

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Apache License 2.0

7.31k stars 426 forks source link

the dataset selection sft on OpenAssisant #111

Closed littleSunlxy closed 6 months ago

littleSunlxy commented 6 months ago

Your chat model looks great! How did you choose the datasets while finetuning on OpenAssisant repo? Otherwise, your chat model finetuning only include sft or also include RW RL training?

VatsaDev commented 6 months ago

They Perform a regular finetune on a40's with Oasst ChatMl, Theres also some DPO versions