qgallouedec / panda-gym

Set of robotic environments based on PyBullet physics engine and gymnasium.
MIT License
506 stars 109 forks source link

Physics client closed during environment destruction #32

Closed louixp closed 2 years ago

louixp commented 2 years ago

Describe the bug

I am trying to do a one-step lookahead on the environment's action space by calling deepcopy() on the environment to try out different actions. In pseudo-code, it would look something like this:

actions = [env.action_space.sample() for _ in range(10)]
for action in actions:
  env_copy = deepcopy(env)
  obs, rew, done, info = env_copy.step(action)
  # do stuff with the one step lookahead for each action

However, I cannot quite do this because env_copy gets garbage collected after the first iteration of the loop. The garbage collector closes the physics_client._client in the PyBullet object, which is shared by env and env_copy. As a result, the second iteration of the loop fails at env_copy.step(action) because env_copy was copied from env which has lost its connection to the physics client.

To verify this, you may use the following code on any panda_gym environment.

import deepcopy

env_copy = deepcopy(env)
assert(id(env.sim.physics_client._client) == id(env_copy.sim.physics_client._client))
del env_copy
env.reset() # error: Not connected to physics server.

Is it possible to close the connection based on a reference count implementation? Or would you recommend any other way to do this one-step lookahead?

qgallouedec commented 2 years ago

In fact, I don't think the solution is to copy the environment, but to make it resettable.

Gym environment are, generally speaking, not resettable to a particular state (see https://github.com/openai/gym/issues/402). But I agree it could be a good enhancement to make panda_gym resettable. It could work like:

actions = [env.action_space.sample() for _ in range(10)]
sim_state = env.save_state()
for action in actions:
  env.load_state(state)
  obs, rew, done, info = env.step(action)
  # do stuff with the one step lookahead for each action

I don't know precisely how to do it, but the question has already been addressed here: https://pybullet.org/Bullet/phpBB3/viewtopic.php?t=11972

I don't have time to work on this improvement, but if you do, I would welcome a PR.

louixp commented 2 years ago

Thanks for the pointer. They are very helpful! I have created a PR to address this: https://github.com/qgallouedec/panda-gym/pull/33