Open qwenzo opened 1 month ago
Hi,

First of all, thank you very much for the repo! I would like to ask: is the EOS token in the reward model dataset necessary for the model? I'm using a GPT-2 model, where the EOS token also serves as the BOS token, so a dedicated EOS is not normally used. That's why I was wondering whether this token is needed for reward modeling or whether it is model-specific.

https://github.com/OpenLLMAI/OpenRLHF/blob/072e286a5c5f3cd6acf2c9ad7e4ef727a8dedb83/openrlhf/datasets/reward_dataset.py#L148

Thank you!

Usually, we use the EOS token to output the reward value, so we did not consider the GPT-2 case.
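For context, here is a minimal sketch of that pattern, assuming a scalar value head over the backbone's final hidden states; the names (`value_head`, `reward`) are illustrative, not OpenRLHF's actual API:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

backbone = AutoModel.from_pretrained("gpt2")
value_head = torch.nn.Linear(backbone.config.hidden_size, 1)  # hypothetical reward head

# Append EOS to each sample, as the reward dataset does.
texts = ["prompt and chosen response" + tokenizer.eos_token]
batch = tokenizer(texts, return_tensors="pt", padding=True)

hidden = backbone(**batch).last_hidden_state   # (batch, seq, hidden)
values = value_head(hidden).squeeze(-1)        # (batch, seq)

# The reward is read at the last non-padding position, i.e. the appended EOS.
eos_index = batch["attention_mask"].sum(dim=1) - 1
reward = values.gather(1, eos_index.unsqueeze(1)).squeeze(1)  # (batch,)
```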
Hi, thank you for the reply. What is a possible way to fix this? Should I use a new special token?
You can try to allocate an unused special token instead.
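One way to do that with the Hugging Face `transformers` API is sketched below; the token string `"<|reward|>"` is an arbitrary placeholder, not something the repo defines:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# Register a dedicated token so it no longer collides with GPT-2's shared BOS/EOS.
tokenizer.add_special_tokens({"additional_special_tokens": ["<|reward|>"]})

model = AutoModel.from_pretrained("gpt2")
# The new token needs an embedding row, so resize to the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))

# Append the dedicated token (instead of EOS) when building reward samples,
# then read the reward value at its position as in the snippet above.
text = "prompt and chosen response" + "<|reward|>"
ids = tokenizer(text, return_tensors="pt")
```

Note that the newly added embedding row is randomly initialized, so the reward model should be fine-tuned after the resize rather than used zero-shot.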