Closed nottombrown closed 7 years ago
TRPO matches its earlier performance:
python rl_teacher/teach.py -w 4 -a parallel_trpo -p synth -l 700 -e Reacher-v1 -n debug-ppo/trpo-synth-64-700-s1 -V -s 1
PPO is also learning well, although its performance at 700 labels may need some tuning.