Question about DDPG algorithm

openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

MIT License


Question about DDPG algorithm #99

Closed wonchul-kim closed 6 years ago

wonchul-kim commented 7 years ago

In the HalfCheetah simulation, is there no need to break out of the loop when 'done' becomes true (if done:)?

For example, when I ran simulations with Pendulum-v0, I always put a 'break' where 'done' becomes 'True'.

unixpickle commented 6 years ago

When done is true, that means the episode has ended. At this point, you can either stop using the environment, or you can call reset() to start another episode. DDPG does the latter in order to run multiple episodes.
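
To make the difference concrete, here is a minimal sketch of both patterns. It is not the baselines DDPG code: `ToyEnv`, `run_single_episode`, and `run_fixed_steps` are hypothetical names made up for illustration, and the stub environment only mimics the classic Gym interface (`reset()` returns an observation, `step()` returns `obs, reward, done, info`) so the example runs without Gym installed.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (not part of baselines)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return 0.0

    def step(self, action):
        # Advance one timestep; the episode ends after episode_length steps.
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        done = self.t >= self.episode_length
        return obs, reward, done, {}


def run_single_episode(env):
    """Pattern from the question: break out of the loop once done is True."""
    env.reset()
    steps = 0
    while True:
        _, _, done, _ = env.step(random.random())
        steps += 1
        if done:
            break
    return steps


def run_fixed_steps(env, total_steps):
    """Pattern from the answer: on done, call reset() and keep stepping."""
    env.reset()
    episodes = 0
    for _ in range(total_steps):
        _, _, done, _ = env.step(random.random())
        if done:
            episodes += 1
            env.reset()  # start another episode instead of breaking
    return episodes


single_steps = run_single_episode(ToyEnv(episode_length=5))
completed_episodes = run_fixed_steps(ToyEnv(episode_length=5), total_steps=20)
```

Both loops handle 'done' correctly. The first stops after one episode, while the second spends a fixed step budget across as many episodes as fit, which is why a 'break' is not required there.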