Closed by tothemoon96 8 months ago
Can you provide a minimal reproducible example with the default RM script? That would be helpful, thank you!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Buggy output
System Info
Information
Tasks
no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
Reproduction
I have packaged my environment in the Docker image tothemoon/temp:20230917
After entering the Docker environment, please clone https://github.com/tothemoon96/rlhf.git
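For anyone trying to reproduce this, here is a rough sketch of the steps as shell helpers. The image tag, repository URL and script/rm_test.sh come from this issue. The function names enter_container and run_repro are just illustrative, and the docker run flags and launching the script with bash from the repository root are assumptions, so adjust them to your setup.

```shell
# Names taken from the issue text.
IMAGE="tothemoon/temp:20230917"
REPO_URL="https://github.com/tothemoon96/rlhf.git"

# Host side: open an interactive shell in the packaged environment.
# Assumption: the image ships bash; add GPU flags (e.g. --gpus all) if your setup needs them.
enter_container() {
  docker run --rm -it "$IMAGE" bash
}

# Container side: fetch the code and run the test script mentioned in the issue.
# Assumption: script/rm_test.sh is meant to be launched with bash from the repository root.
run_repro() {
  git clone "$REPO_URL" &&
    cd rlhf &&
    bash script/rm_test.sh
}
```

Call enter_container on the host, then define and call run_repro inside the container. Nothing runs until one of the functions is called.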
Expected behavior
The normal run, train_rm.py (commented in script/rm_test.sh), should be identical to train_rm_bug.py, without exceptions.