nottombrown/rl-teacher
Code for Deep RL from Human Preferences [Christiano et al]. Plus a webapp for collecting human feedback
MIT License
559 stars, 95 forks
Collect snapshots of agent parameters to allow sharing of trained agents
#23
Closed by nottombrown 7 years ago
nottombrown commented 7 years ago
It seems like this should be easy to get working with PPO