Closed nooobodynose closed 5 months ago
Hello @nooobodynose, the second option is the proper way to prepare the dataset for ORPOTrainer in TRL!
For further clarification, the prompt could be preprocessed through tokenizer.apply_chat_template(..., tokenize=False, add_generation_prompt=True)
as in your example🙂
Thanks for your quick response, it is clear now : ) Great work 👍
Hey!
Trying to understand the prompt format we need to prepare for ORPOTrainer in
trl
.In trl ORPO doc it presents the dataset in this format (shortened for visibility):
Let's say we're preparing the ORPO dataset for Mixtral / Mistral Instruct:
should we pass
train_dataset
param toORPOTrainer
as :[1] Plain Messages
or
[2] Messages formatted in its chat template.
Thanks in advance for your clarification : )