Run trpo_swimmer in stub mode

zhuojw10 commented 7 years ago

''python example/trpo_swimmer.py'' works well. With the default settings, it reaches an average reward of 55.72 after 40 iterations.

When I run trpo_swimmer.py in ''stub'' mode (I simply add ''stub(globals())'' at the beginning and replace ''algo.train()'' with ''run_experiment_lite(...)'', following ddpg_cartpole and ddpg_cartpole_stub), it still works. However, with the same default settings, it only reaches an average reward of 49.59. I tried different random seeds and the difference remained.
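For context, here is a toy sketch of the deferred-call idea that stub mode appears to rely on. It is not rllab's code: DeferredClass, DeferredObject, DeferredCall, resolve and ToyAlgo are hypothetical names made up for illustration. The assumption is that after ''stub(globals())'', building an object and calling ''algo.train()'' only records the call, and ''run_experiment_lite(...)'' executes the recorded call later, in a separate process with its own settings.

```python
class DeferredCall:
    """A recorded method call on an object that has not been built yet."""

    def __init__(self, target, method, args, kwargs):
        self.target = target
        self.method = method
        self.args = args
        self.kwargs = kwargs


class DeferredObject:
    """A recorded constructor call; the real object is built only in resolve()."""

    def __init__(self, cls, args, kwargs):
        self._cls = cls
        self._args = args
        self._kwargs = kwargs

    def __getattr__(self, name):
        # Reached only for attributes that do not exist on the proxy itself.
        if name.startswith("_"):
            raise AttributeError(name)

        def record(*args, **kwargs):
            # Record the call instead of executing it.
            return DeferredCall(self, name, args, kwargs)

        return record


class DeferredClass:
    """Stands in for a real class and records how it is instantiated."""

    def __init__(self, cls):
        self._cls = cls

    def __call__(self, *args, **kwargs):
        return DeferredObject(self._cls, args, kwargs)


def resolve(node):
    """Build the real objects and run the recorded call (the 'launcher' side)."""
    if isinstance(node, DeferredCall):
        obj = resolve(node.target)
        args = [resolve(a) for a in node.args]
        kwargs = {k: resolve(v) for k, v in node.kwargs.items()}
        return getattr(obj, node.method)(*args, **kwargs)
    if isinstance(node, DeferredObject):
        args = [resolve(a) for a in node._args]
        kwargs = {k: resolve(v) for k, v in node._kwargs.items()}
        return node._cls(*args, **kwargs)
    return node  # plain values pass through unchanged


class ToyAlgo:
    """Hypothetical stand-in for an algorithm class such as TRPO."""

    def __init__(self, n_itr):
        self.n_itr = n_itr

    def train(self):
        return "trained for %d iterations" % self.n_itr


Algo = DeferredClass(ToyAlgo)   # roughly what stubbing does to a class
call = Algo(n_itr=40).train()   # nothing is trained here; the call is recorded
result = resolve(call)          # the recorded call is executed only now
```

If this picture is right, the stub run and the direct run do not necessarily share the same seed or process setup, which is one place to look when comparing the two results.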

Why does this difference exist?

dementrock commented 7 years ago

Try setting the scale_reward option in DDPG to 0.1 or 0.01.

zhuojw10 commented 7 years ago

@dementrock Thanks

rll / rllab #65