Closed sanagno closed 1 year ago
Maybe set the same global seed?
I have a random generator with a specific seed for the dataset only: https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_training/custom_datasets/prompt_dialogue.py#L31
Hi! I am an industry research scientist and I am interested in contributing to these problems. Am I correct in my understanding that we want the dataset splits 'sft' and 'reward_model' to be identical, while 'rl' is a different split? If so, I think I can solve this problem by creating a fixed order of indices using the random state, then using the same slice of indices for 'sft' and 'reward_model' and a different slice for 'rl'. Since this update makes 'sft' and 'reward_model' the same split, the split sizes will need to change too so that all the data is used.
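A minimal sketch of the idea in the comment above, not the repository's actual code: the function name `split_indices`, the seed value, and the `shared_fraction` parameter are all hypothetical. It shuffles the indices once with a locally seeded generator, gives 'sft' and 'reward_model' the same slice, and gives 'rl' the remainder so that all the data is used.

```python
import random


def split_indices(n_items, seed=42, shared_fraction=0.8):
    """Build per-mode index lists from one fixed shuffled order.

    All names and defaults here are illustrative assumptions.
    """
    # Local generator with a fixed seed, so the global random state is untouched.
    rng = random.Random(seed)
    order = list(range(n_items))
    rng.shuffle(order)

    # One cut point: everything before it is shared, everything after goes to 'rl'.
    cut = int(n_items * shared_fraction)
    shared = order[:cut]
    return {
        "sft": shared,
        "reward_model": shared,  # same slice as 'sft'
        "rl": order[cut:],       # disjoint remainder
    }


splits = split_indices(10)
```

Because the generator is seeded locally, calling `split_indices` again with the same arguments should reproduce the same splits regardless of any other random calls made during training.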
hey @bethanyconnolly yes feel free to do this. Btw @theblackcat102 if you are doing any special preprocessing please push the script, I am missing the history tags.
I think this was not meant to be closed by #1776, so I am reopening.
@sanagno by history tag, are you referring to the reward model or the RLHF training?
@sanagno sorry, that's my fault. It's a quick hack to include the conversation history in the question section (first sentence). Should we use the question and answer tags instead?
No no, it's fine. We can just add it to the dataset init, or to the README, so that the instructions to recreate it are easy to follow.
Hi, I am feeling a bit confused about how to handle this issue now. Perhaps it can be unassigned and given to someone else with better context on the problem? Thanks
We need to make sure that different splits of the dataset are used for sft, reward and rl training.
Basically, the sft_dataset and reward datasets need to use the same splits.