Closed littleSunlxy closed 6 months ago
Your chat model looks great! How did you choose the datasets while finetuning on OpenAssisant repo? Otherwise, your chat model finetuning only include sft or also include RW RL training?
They Perform a regular finetune on a40's with Oasst ChatMl, Theres also some DPO versions
Your chat model looks great! How did you choose the datasets while finetuning on OpenAssisant repo? Otherwise, your chat model finetuning only include sft or also include RW RL training?