yamatokataoka / learning-from-human-preferences

Replication of Deep Reinforcement Learning from Human Preferences (Christiano et al., 2017).
MIT License

set up rl-human-prefs #9

Closed by yamatokataoka 2 years ago

yamatokataoka commented 2 years ago