nottombrown / rl-teacher

Code for Deep RL from Human Preferences [Christiano et al.], plus a webapp for collecting human feedback
MIT License
556 stars 93 forks

Refactor predictor #6

Closed (by nottombrown, 7 years ago)