huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0
10.13k stars 1.28k forks source link

[Question] `add_generation_prompt=True` on prompt #2346

Closed Galaxy-Husky closed 1 week ago

Galaxy-Husky commented 1 week ago

Hi,

I noticed that from v0.11.0, maybe_apply_chat_template added generation prompt on the prompt of the example, which was different from the previous version. https://github.com/huggingface/trl/blob/0238d96c6f43abd808804ddb0065883964073060/trl/data_utils.py#L86-L88

Was it a little bug? Will the generation prompt influence the final result?

A prompt reply is appreciated.

qgallouedec commented 1 week ago

Can you point the "previous version" you are refering to?

qgallouedec commented 1 week ago

I think it has been like this from the initial implementation (see #2020)

Galaxy-Husky commented 1 week ago

I think it has been like this from the initial implementation (see #2020)

Sorry, I didn't say that right. I mean before v0.11.0, there was no maybe_apply_chat_template back then. For example, the dpo dataset was preprocessed like: https://github.com/huggingface/trl/blob/55cc4b1076144b74a6ce5d07557b7f664b1de8d9/examples/scripts/dpo.py#L156-L160

Since the code has been refactored , I'm not sure if there was generation prompt or not. If so, could you please point out where it was implemented?

qgallouedec commented 1 week ago

Yes the example code was wrong, you need to add a generation prompt at the end of the prompt.

Galaxy-Husky commented 1 week ago

Yes the example code was wrong, you need to add a generation prompt at the end of the prompt.

I see. Thanks a lot!