Back to pooled rollouts, but this time with random seed set using wor…

nottombrown / rl-teacher

Code for Deep RL from Human Preferences [Christiano et al]. Plus a webapp for collecting human feedback

MIT License

Open ghost opened 5 years ago

ghost commented 5 years ago

…ker index.
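
For readers unfamiliar with the idea in the title, here is a minimal sketch of seeding pooled rollouts by worker index. It is not taken from the rl-teacher code: `worker_seed`, `rollout`, `pooled_rollouts`, the `base_seed + worker_index` scheme, and the toy random-action "rollout" are all hypothetical names and assumptions chosen for illustration. The motivation is that worker processes can otherwise start from the same random state and produce identical rollouts.

```python
import multiprocessing
import random


def worker_seed(base_seed, worker_index):
    # Assumed scheme (not from the repo): offset a shared base seed
    # by the worker index so each worker gets its own seed.
    return base_seed + worker_index


def rollout(worker_index, base_seed=0, n_steps=5):
    # Each call builds its own RNG from the worker-specific seed, so the
    # result is reproducible and does not depend on inherited global state.
    rng = random.Random(worker_seed(base_seed, worker_index))
    # Toy stand-in for an environment rollout: a list of random actions in 0..3.
    return [rng.randint(0, 3) for _ in range(n_steps)]


def pooled_rollouts(n_workers, base_seed=0):
    # Run one rollout per worker index in a process pool.
    # Results come back in worker-index order.
    with multiprocessing.Pool(n_workers) as pool:
        args = [(index, base_seed) for index in range(n_workers)]
        return pool.starmap(rollout, args)
```

Calling `pooled_rollouts(4)` from a script (under an `if __name__ == "__main__":` guard) would return four action lists, one per worker index; rerunning with the same `base_seed` gives the same lists.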