Description
In the apply_chat_template function used for DPO training, a generation prompt appears to be added even when add_generation_prompt is not set to True. This produces a repeated assistant header in the Llama template, which may affect training outcomes.
Steps to Reproduce
Apply the apply_chat_template function as follows:
Review the outputs in different parts of the dataset.
Expected Behavior
The function should not add generation_prompt to the outputs unless explicitly set by add_generation_prompt=True.
Observed Behavior
The outputs include repeated assistant turns in the Llama template, as shown in the examples below:
Prompt sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
xxxxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Chosen sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>
xxxxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Rejected sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>
xxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
This repetition of the assistant's turn <|start_header_id|>assistant<|end_header_id|> appears irrespective of the setting of add_generation_prompt.
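To check how widespread this is, a small helper can flag samples that end with the assistant header and, as a stopgap, strip it. This is only a minimal sketch and not part of any library: the names `ASSISTANT_HEADER`, `ends_with_generation_prompt`, and `strip_trailing_generation_prompt` are hypothetical, and it assumes the formatted samples are plain strings like the ones shown above.

```python
# Hypothetical diagnostic helpers; not part of the training code or any library.
# Assumption: the generation prompt is the Llama assistant header shown in the
# samples above, possibly followed by trailing whitespace.
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"


def ends_with_generation_prompt(text: str, header: str = ASSISTANT_HEADER) -> bool:
    """Return True if the formatted text ends with an assistant header."""
    return text.rstrip().endswith(header)


def strip_trailing_generation_prompt(text: str, header: str = ASSISTANT_HEADER) -> str:
    """Remove one trailing assistant header, if present; otherwise return text unchanged."""
    stripped = text.rstrip()
    if stripped.endswith(header):
        return stripped[: -len(header)]
    return text


# The chosen sample reported above (content elided as in the report).
chosen = (
    "<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>\n"
    "xxxxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>"
)

print(ends_with_generation_prompt(chosen))
print(strip_trailing_generation_prompt(chosen))
```

Running the check over the chosen and rejected columns would show whether every sample carries the extra header or only some of them. Stripping it afterwards is a workaround for inspection, not a fix for the underlying behavior.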
Additional Information
It is unclear whether this behavior affects the training outcomes negatively, but it certainly alters the intended structure of the training data.
No custom modifications were made to the function; the issue persists with the default implementation.
Please investigate this issue, as it may be harming the training process. Guidance on the expected outputs and on how to use apply_chat_template correctly would also be appreciated.