garymcintire closed this issue 7 years ago
Hey Gary, as in Deep RL from Human Preferences, we remove the done signals. You can see the envs.py file for details.
I'd be interested in accepting PRs that make it easy to run the unmodified environments as well as the modified ones.
See the following issue: https://github.com/nottombrown/rl-teacher/issues/5
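For readers following along, here is a minimal sketch of what "removing the done signals" can look like. This is not the code in envs.py. `ToyEnv` and `SuppressDone` are hypothetical names made up for illustration, and the sketch assumes the classic Gym-style `step` return of `(obs, reward, done, info)`. The wrapper hides the environment's own termination flag and ends the episode only after a fixed number of steps, which would match the 1000-step rollouts described in this issue.

```python
class ToyEnv:
    """Hypothetical stand-in environment that reports done from step 3 onward."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        # (observation, reward, done, info) in the classic Gym-style layout
        return self.t, 1.0, self.t >= 3, {}


class SuppressDone:
    """Hypothetical wrapper: ignore the env's done flag, stop at max_steps."""

    def __init__(self, env, max_steps=1000):
        self.env = env
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.steps += 1
        # Keep the original flag around for debugging, but do not act on it.
        info = dict(info, original_done=done)
        return obs, reward, self.steps >= self.max_steps, info
```

With a wrapper like this, a print statement on the returned done flag shows False until the step limit is reached, while `info["original_done"]` still shows what the underlying environment reported. Whether a particular environment can safely be stepped after it reports done is a separate question that this sketch does not address.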
Thanks for clarifying
I'm leaving this open because it's a separate issue from #5
Ah, actually this is already an open issue. Closing in favor of #12
I tried this and watched the movies:
python -u rl_teacher/teach.py -p rl -e Humanoid-v1 -n base-rl -w 12
It always runs the full 1000 steps. Adding a print statement in rollouts.py shows that env.step never returns a 'done'.
Is it supposed to be like this? If so, why?