PPO data en

Hello!

I really like your work on RLHF research. The technical report gives a very clear description, and the implementation is good. I have studied the entire code in detail and read the article several times.

For my research, I would like to reproduce your results, but I can't find the English prompt dataset that was used for PPO training. In the article, you write that a manually collected dataset was used, but I can't find it anywhere. Could you please share this dataset so that I can run your code?

Thanks

OpenLMLab / MOSS-RLHF

PPO data en #27