OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

train_rm apply custom tokenizer chat template #334

Closed mickelliu closed 6 days ago

mickelliu commented 6 days ago

I have a use case where I need to train an RM from a base model, so it would be helpful to be able to apply a chat template here too. This is copy-pasted from train_sft.py.
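A minimal sketch of the idea: for reward-model training, each (chosen, rejected) pair would be rendered through the chat template before tokenization, the same way train_sft.py formats SFT samples. The `apply_chat_template` function below is a hypothetical stand-in for `tokenizer.apply_chat_template(messages, tokenize=False)`; the real template string lives on the tokenizer and varies by model.

```python
def apply_chat_template(messages):
    # Stand-in for tokenizer.apply_chat_template(messages, tokenize=False).
    # A real HF tokenizer renders its chat_template (a Jinja string) instead.
    return "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)

def build_rm_pair(prompt, chosen, rejected):
    # An RM training sample is a (chosen, rejected) pair sharing one prompt;
    # both completions are formatted with the same chat template.
    base = [{"role": "user", "content": prompt}]
    chosen_text = apply_chat_template(
        base + [{"role": "assistant", "content": chosen}]
    )
    rejected_text = apply_chat_template(
        base + [{"role": "assistant", "content": rejected}]
    )
    return chosen_text, rejected_text

chosen_text, rejected_text = build_rm_pair(
    "What is 2+2?", "2+2 equals 4.", "I refuse to answer."
)
```

Both formatted strings would then be tokenized and fed to the reward model, which scores the chosen text above the rejected one.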