Closed: louixp closed this issue 2 years ago
In fact, I don't think the solution is to copy the environment, but to make it resettable.
Gym environments are, generally speaking, not resettable to a particular state (see https://github.com/openai/gym/issues/402). But I agree it could be a good enhancement to make panda_gym resettable. It could work like:
```python
actions = [env.action_space.sample() for _ in range(10)]
sim_state = env.save_state()
for action in actions:
    env.load_state(sim_state)  # restore the saved state before trying each action
    obs, rew, done, info = env.step(action)
    # do stuff with the one-step lookahead for each action
```
I don't know precisely how to do it, but the question has already been addressed here: https://pybullet.org/Bullet/phpBB3/viewtopic.php?t=11972
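Following that forum thread, one plausible implementation would wrap PyBullet's in-memory snapshot API (`saveState` / `restoreState`). The class and method names below are hypothetical, a sketch of how `save_state`/`load_state` could be built, not panda-gym code:

```python
import pybullet as p

class ResettableSim:
    """Hypothetical wrapper sketching save/load of simulation state
    via PyBullet's in-memory snapshots."""

    def __init__(self):
        # DIRECT mode: headless physics server, no GUI.
        self.client = p.connect(p.DIRECT)

    def save_state(self):
        # Returns an integer id for an in-memory snapshot of the world.
        return p.saveState(physicsClientId=self.client)

    def load_state(self, state_id):
        # Restores the world to a previously saved snapshot.
        p.restoreState(stateId=state_id, physicsClientId=self.client)
```

In-memory snapshots avoid the disk round-trip of `saveBullet`/`restoreState(fileName=...)`, which matters when the lookahead restores the state once per candidate action.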
I don't have time to work on this improvement, but if you do, I would welcome a PR.
Thanks for the pointers. They are very helpful! I have created a PR to address this: https://github.com/qgallouedec/panda-gym/pull/33
Describe the bug
I am trying to do a one-step lookahead on the environment's action space by calling `deepcopy()` on the environment to try out different actions. In pseudo-code, it would look something like this:

However, I cannot quite do this, because `env_copy` gets garbage collected after the first iteration of the loop. The garbage collector closes the `physics_client._client` in the `PyBullet` object, which is shared by `env` and `env_copy`. As a result, the second iteration of the loop fails at `env_copy.step(action)`, because `env_copy` was copied from `env`, which has lost its connection to the physics client. To verify this, you may use the following code on any panda_gym environment.
Is it possible to close the connection based on reference counting, so that it is only disconnected once the last object using it is collected? Or would you recommend another way to do this one-step lookahead?