nottombrown/rl-teacher
Code for Deep RL from Human Preferences [Christiano et al]. Plus a webapp for collecting human feedback
MIT License
556 stars, 93 forks
Back to pooled rollouts, but this time with random seed set using wor…
#41
Open
ghost opened 5 years ago
ghost commented 5 years ago
…ker index.
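For readers unfamiliar with the idea in the title, here is a minimal sketch of seeding pooled rollout workers by worker index. It is not the code from this PR: `BASE_SEED`, `worker_seed`, `rollout`, and `pooled_rollouts` are hypothetical names, and the "rollout" is a stand-in that just draws random actions. The likely motivation is that pooled worker processes can otherwise start from the same RNG state (for example, a forked copy of NumPy's global generator) and produce identical rollouts.

```python
import random
from multiprocessing import Pool

BASE_SEED = 1234  # hypothetical base seed, not taken from the PR


def worker_seed(base_seed, worker_index):
    """Derive a distinct, reproducible seed for each worker."""
    return base_seed + worker_index


def rollout(worker_index, base_seed=BASE_SEED, n_steps=5):
    """Stand-in rollout: draw n_steps random 'actions' from a per-worker RNG."""
    # A private RNG per worker avoids sharing global random state.
    rng = random.Random(worker_seed(base_seed, worker_index))
    return [rng.randrange(4) for _ in range(n_steps)]


def pooled_rollouts(n_workers, base_seed=BASE_SEED, n_steps=5):
    """Run one rollout per worker index in a process pool."""
    args = [(i, base_seed, n_steps) for i in range(n_workers)]
    with Pool(n_workers) as pool:
        return pool.starmap(rollout, args)
```

Calling `pooled_rollouts(4)` from under an `if __name__ == "__main__":` guard would return four lists of actions, one per worker index. Because each seed depends only on the base seed and the index, rerunning gives the same rollouts while different workers still explore differently.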