Previous trajectory is executed when restoring a saved environment

When an environment's state is restored and a new step is taken, the previous trajectory is executed again as well, so it is not possible to resume from the saved state.

Setup:

from osim.env import ProstheticsEnv
env = ProstheticsEnv(visualize=True, integrator_accuracy=1e-1)  # coarse accuracy so we quickly see what happens
env.reset()

We then save the state at an arbitrary point in time (here at t = 0) and execute 50 steps:

state_checkpoint = env.osim_model.get_state()  # store state
for i in range(50):
    env.step(env.action_space.high)  # execute step with static action

After restoring the state and executing one more step, the previous steps (50 in our case) are executed again as well:

env.osim_model.set_state(state_checkpoint)  # restore state
env.step(env.action_space.high)
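
The expected behaviour can be phrased as a small check: a step taken right after restoring should give the same observation as the first step taken right after saving. The sketch below is a library-agnostic version of that check and assumes a deterministic simulation. `restore_is_faithful`, `ToyEnv`, and its methods are hypothetical names used for illustration only; they are not part of osim-rl.

```python
import copy


def restore_is_faithful(step, save, restore, n_between=50):
    """Return True if a step after restore matches the first step after save.

    `step`, `save`, and `restore` are caller-supplied callables (hypothetical
    names): `step()` advances the simulation once and returns an observation,
    `save()` returns a checkpoint, `restore(checkpoint)` reinstates it.
    """
    checkpoint = save()
    first_obs = step()              # reference: first step after the save
    for _ in range(n_between - 1):  # move well away from the checkpoint
        step()
    restore(checkpoint)
    return step() == first_obs      # should repeat the reference step


class ToyEnv:
    """Deterministic toy simulation used only to exercise the check."""

    def __init__(self):
        self.state = {"t": 0, "x": [0.0]}

    def step(self):
        self.state["t"] += 1
        self.state["x"][0] += 0.5 * self.state["t"]
        return (self.state["t"], self.state["x"][0])

    def save(self):
        return copy.deepcopy(self.state)

    def restore(self, checkpoint):
        self.state = copy.deepcopy(checkpoint)


env = ToyEnv()
faithful = restore_is_faithful(env.step, env.save, env.restore)
```

For the setup in this report, the three callables would presumably wrap env.step(env.action_space.high), env.osim_model.get_state, and env.osim_model.set_state. Given the behaviour described here, the check would be expected to fail.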

I also tried shallow-copying the state (copy.deepcopy complains about unpicklable SWIG objects), but this did not change anything. Setting the state's Y value (the state's internal representation, as far as I understand) using env.osim_model.state.setY(y_checkpoint), with or without setting the state beforehand, did not change the outcome either. This might be related to the Python interface of SimTK::State being slightly buggy, but it could be unrelated as well.
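
As a general illustration of why a shallow copy may not work as a checkpoint, the sketch below uses a hypothetical `ToyState` class with a mutable buffer. Whether the SWIG-wrapped SimTK::State behaves this way is an assumption, not something verified here.

```python
import copy


class ToyState:
    """Hypothetical stand-in for a state object holding a mutable buffer."""

    def __init__(self, values):
        self.y = list(values)


live = ToyState([0.0, 0.0])
shallow = copy.copy(live)      # new wrapper, same underlying list
deep = copy.deepcopy(live)     # new wrapper, independent list

live.y[0] = 1.0                # simulate the integrator advancing in place

shallow_value = shallow.y[0]   # follows the live state
deep_value = deep.y[0]         # keeps the value from checkpoint time
```

In this toy case only the deep copy preserves the checkpointed value, because the shallow copy shares its buffer with the live object.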

I am on Windows 10 using the latest OpenSim version (Python 3.6.1) and followed the recommended installation instructions.

Related links: #79, #125, 7ecae69c3cc8021455e3f9a3e207a2689e743929.

Repository: stanfordnmbl/osim-rl, issue #131.