nottombrown / rl-teacher

Code for Deep RL from Human Preferences [Christiano et al]. Plus a webapp for collecting human feedback
MIT License

Configure system to use unmodified environments #12

Open nottombrown opened 7 years ago

nottombrown commented 7 years ago

To compare performance with existing baselines, the default configuration should use exactly the same environment semantics as the standard gym env.

For example, the following should match the normal Reacher-v1 behavior:

python rl_teacher/teach.py -p rl -e Reacher-v1

If you want to use the modified envs, you can pass ReacherNoTorque-v1 or ReacherNoTorqueFixedLength-v1 instead.
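
One way the requested behavior could be structured, as a minimal sketch: unlisted env ids pass through untouched, and only the explicitly named variants opt in to modifications. The names `MODIFIED_ENVS` and `resolve_env`, and the labels `no_torque` and `fixed_length`, are hypothetical placeholders inferred from the env ids above. They are not rl-teacher's actual code or wrappers.

```python
# Hypothetical lookup table (an assumption for illustration, not rl-teacher code).
# Maps the env id given via -e to the standard gym id to build, plus the
# names of any opt-in modifications. The labels are placeholders only.
MODIFIED_ENVS = {
    "ReacherNoTorque-v1": ("Reacher-v1", ("no_torque",)),
    "ReacherNoTorqueFixedLength-v1": ("Reacher-v1", ("no_torque", "fixed_length")),
}


def resolve_env(env_id):
    """Return (base_id, modifications) for the requested env id.

    Ids not listed in MODIFIED_ENVS pass through unchanged with an empty
    modification tuple, so the default keeps standard gym semantics.
    """
    return MODIFIED_ENVS.get(env_id, (env_id, ()))


if __name__ == "__main__":
    for name in ("Reacher-v1", "ReacherNoTorque-v1", "ReacherNoTorqueFixedLength-v1"):
        base_id, modifications = resolve_env(name)
        print(name, "->", base_id, modifications)
```

With this shape, `-e Reacher-v1` would resolve to the plain gym env with nothing applied, which keeps results comparable with published baselines, while the modified behavior stays available under its own ids.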