nottombrown/rl-teacher
Code for Deep RL from Human Preferences [Christiano et al.], plus a webapp for collecting human feedback.
MIT License
556 stars
93 forks
Refactor comparison_collectors and label_schedules into modules (#2)
Closed by nottombrown 7 years ago