tomekkorbak / pretraining-with-human-feedback

Code accompanying the paper "Pretraining Language Models with Human Preferences"
https://arxiv.org/abs/2302.08582
MIT License
177 stars, 14 forks