Closed alvarobartt closed 4 months ago
I noticed the same thing today, while studying their code. Whether that's deliberate or not, I don't think it will hurt the performance if we won't add response tokens to the prompt
It looks like this change was made in https://github.com/huggingface/alignment-handbook/commit/f0ffa0d7a6ab666b1f80f3f7dbb3c6364ac31967#diff-0668e2e3ee795fdc034f50182f4719a5f8574357831f2e4705fa730ed2db5831L76 by @lewtun but I can't spot an explanation. It looks delivery, so it's probably safe to assume it doesn't affect performance.
Hi @lewtun, friendly pinging you here!
Did you see any performance issue when adding the generation_prompt
as part of the chosen
and rejected
pairs instead of keeping it within the prompt
itself just as the former version? I'll be comparing both approaches, but just wondering whether there's an explanation backing the change, or simply because that worked better during the experiments you ran.
Thanks in advance 🤗
Ok I've already run two full fine-tunes using DPO (similarly to the HuggingFaceH4/zephyr-7b-gemma-v0.1
recipe) and both approaches work similarly, so I guess there are no issues on adding the generation prompt as part of the chosen
and rejected
pairs, see the wandb
screenshot below:
16bit
is the full DPO fine-tune where theadd_generation_prompt=True
and then it's stripped from bothchosen
andrejected
; while16bit-no-gen-prompt
is the full DPO fine-tune whereadd_generation_prompt=False
and thechosen
andrejected
are tokenized normally.
Description
Hi here! 🤗
I was wondering what's the reason under the current approach for preparing the
datasets.Dataset
before DPO fine-tuning, since now it seems that the assistant token is included within thechosen
andrejected
samples, rather than keeping it as part of theprompt
i.e. thetokenizer.apply_chat_template
call on theprompt
does not have theadd_generation_prompt=True
.Shouldn't it be part of the
prompt
so that thechosen
andrejected
is only the response itself?Example for
chosen
Before
prompt
chosen
Now
prompt
chosen
Thanks in advance!