Closed alvarobartt closed 7 months ago
Oops may need some sleep, I saw now that you're doing it that way to keep the attention_mask
including the whole conversation for both the chosen
and the rejected
messages in the chat, so the prompt is required there. Sorry for the misunderstanding!
Oops may need some sleep, I saw now that you're doing it that way to keep the
attention_mask
including the whole conversation for both thechosen
and therejected
messages in the chat, so the prompt is required there. Sorry for the misunderstanding! @alvarobartt Sorry, I dont understand that, whychosen
andrejected
responses need includeprompt
?
Hi here @jiwooya1000!
After exploring a bit the codebase in order to seek a 1:1 reproduction using the
alignment-handbook
(see https://github.com/huggingface/alignment-handbook/pull/143) I've seen that you're adding thegeneration_prompt
.Example
Given the following
prompt
,chosen
andrejected
values:After applying the chat template mappings defined within the code at:
https://github.com/xfactlab/orpo/blob/23964e92cf590f02e281a320714c3498dc47a3b8/main.py#L86-L88
The resulting values after
apply_chat_template
are the following: