stanfordnmbl / osim-rl

Reinforcement learning environments with musculoskeletal models
http://osim-rl.stanford.edu/
MIT License

something about submit and reset #161

Closed Ailsa1994 closed 6 years ago

Ailsa1994 commented 6 years ago

Something confuses me about the example in submit.py.

```python
while True:
    print(observation)
    [observation, reward, done, info] = client.env_step(env.action_space.sample().tolist())
    if done:
        observation = client.env_reset()
        if not observation:
            break
```

Why is env_reset called when done is true? Shouldn't it be:

```python
while True:
    print(observation)
    [observation, reward, done, info] = client.env_step(env.action_space.sample().tolist())
    if done:
        break
```

If so, the submission video would only cover a single episode.

kidzik commented 6 years ago

The semantics are the following:

It's true that this is not the most intuitive design, but for our very basic workflow it seems to work OK.

The submission script does not assume the number of trials -- it just follows what the server is sending.
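To make the "follow what the server is sending" idea concrete, here is a minimal sketch that runs offline. `FakeClient` and `run_submission` are hypothetical names invented for this illustration and are not part of osim-rl. The only behaviour assumed about the real client is what the snippet in the question implies: `env_reset()` returns a falsy value once the server has no more trials.

```python
class FakeClient:
    """Hypothetical stand-in for the remote client (not the osim-rl API)."""

    def __init__(self, num_trials, steps_per_trial):
        self.trials_left = num_trials          # only the "server" knows this
        self.steps_per_trial = steps_per_trial
        self.steps_left = 0

    def env_reset(self):
        # Assumed behaviour: a falsy return means "no more trials".
        if self.trials_left == 0:
            return False
        self.trials_left -= 1
        self.steps_left = self.steps_per_trial
        return [0.0]

    def env_step(self, action):
        self.steps_left -= 1
        done = self.steps_left == 0
        # Same return shape as in the question: observation, reward, done, info.
        return [[float(self.steps_left)], 1.0, done, {}]


def run_submission(client, policy):
    """Run episodes until the server stops handing out new ones."""
    episodes_finished = 0
    observation = client.env_reset()
    if not observation:
        return episodes_finished
    while True:
        observation, reward, done, info = client.env_step(policy(observation))
        if done:
            episodes_finished += 1
            # Ask for the next trial; the loop ends only when there is none.
            observation = client.env_reset()
            if not observation:
                break
    return episodes_finished


if __name__ == "__main__":
    # The loop never hard-codes the number of trials; the fake server decides.
    print(run_submission(FakeClient(num_trials=3, steps_per_trial=5), lambda obs: [0.0]))
```

With a plain `break` on `done`, the same loop would stop after the first episode no matter how many trials the server intended to run.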