Implemented methods to save and restore PyBullet states.

louixp commented 2 years ago

This PR is to address the feature discussed in https://github.com/qgallouedec/panda-gym/issues/32.

louixp commented 2 years ago

I have added an example of a greedy random search to the documentation. PTAL

qgallouedec commented 2 years ago

This looks great.

Now you need to update index.rst so that this section of the documentation is visible in the index. You'll also need to add a unit test. Add a new file in test called state_test.py. I think one test function covering all three new methods should be enough.

If you want some help, feel free to ask.

louixp commented 2 years ago

Thanks! I have added the unit tests. However, I'm not super familiar with pytest and it's giving me ModuleNotFoundError for panda_gym in all test suites locally. Do you know what I could have done wrong?

qgallouedec commented 2 years ago

To use pytest, install it in your virtual env:

pip install pytest

Then just run

pytest
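A ModuleNotFoundError for panda_gym usually means the interpreter that runs pytest cannot locate the package, for example because pytest and panda_gym live in different environments. That cause is an assumption here, not something confirmed in this thread. As a minimal diagnostic sketch (the helper name is_importable is hypothetical), this prints which interpreter is active and whether it can find the package:

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Show which Python is running and whether it can see panda_gym.
print("interpreter:", sys.executable)
print("panda_gym importable:", is_importable("panda_gym"))
```

If the second line reports False, installing the package into the same environment (for instance with pip install -e . from the repository root) is one likely fix.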

louixp commented 2 years ago

Thanks! It seems like a bunch of tests are failing. I reproduced this with a fresh copy of the repo. Here is the error message:

=========================================================================================================================== short test summary info ============================================================================================================================
FAILED test/envs_test.py::test_reach - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Box(-10...
FAILED test/envs_test.py::test_slide - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Box(-10...
FAILED test/envs_test.py::test_push - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Box(-10....
FAILED test/envs_test.py::test_pickandplace - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: ...
FAILED test/envs_test.py::test_stack - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (6,), float32), desired_goal: Box(-10.0, 10.0, (6,), float32), observation: Box(-10...
FAILED test/envs_test.py::test_flip - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (4,), float32), desired_goal: Box(-10.0, 10.0, (4,), float32), observation: Box(-10....
FAILED test/envs_test.py::test_dense_reach - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: B...
FAILED test/envs_test.py::test_dense_slide - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: B...
FAILED test/envs_test.py::test_dense_push - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Bo...
FAILED test/envs_test.py::test_dense_pickandplace - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observa...
FAILED test/envs_test.py::test_dense_stack - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (6,), float32), desired_goal: Box(-10.0, 10.0, (6,), float32), observation: B...
FAILED test/envs_test.py::test_dense_flip - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (4,), float32), desired_goal: Box(-10.0, 10.0, (4,), float32), observation: Bo...
FAILED test/envs_test.py::test_reach_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: ...
FAILED test/envs_test.py::test_slide_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: ...
FAILED test/envs_test.py::test_push_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: B...
FAILED test/envs_test.py::test_pickandplace_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observ...
FAILED test/envs_test.py::test_stack_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (6,), float32), desired_goal: Box(-10.0, 10.0, (6,), float32), observation: ...
FAILED test/envs_test.py::test_flip_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (4,), float32), desired_goal: Box(-10.0, 10.0, (4,), float32), observation: B...
FAILED test/envs_test.py::test_dense_reach_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observa...
FAILED test/envs_test.py::test_dense_slide_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observa...
FAILED test/envs_test.py::test_dense_push_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observat...
FAILED test/envs_test.py::test_dense_pickandplace_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), ...
FAILED test/envs_test.py::test_dense_stack_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (6,), float32), desired_goal: Box(-10.0, 10.0, (6,), float32), observa...
FAILED test/envs_test.py::test_dense_flip_joints - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (4,), float32), desired_goal: Box(-10.0, 10.0, (4,), float32), observat...
FAILED test/seed_test.py::test_seed_reach - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Bo...
FAILED test/seed_test.py::test_seed_push - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Box...
FAILED test/seed_test.py::test_seed_slide - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observation: Bo...
FAILED test/seed_test.py::test_seed_pick_and_place - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (3,), float32), desired_goal: Box(-10.0, 10.0, (3,), float32), observ...
FAILED test/seed_test.py::test_seed_stack - AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(achieved_goal: Box(-10.0, 10.0, (6,), float32), desired_goal: Box(-10.0, 10.0, (6,), float32), observation: Bo...
================================================================================================================== 29 failed, 30 passed, 87 warnings in 5.90s ==================================================================================================================

qgallouedec commented 2 years ago

Yes, these errors come from the latest version of gym. I solved the problem yesterday on the master branch. I just included these changes to your branch. Pull the changes, force reinstall gym (pip install gym==0.23) and these problems should be solved.

louixp commented 2 years ago

Awesome thanks! I fixed some small errors in tests, but everything should be good now. Everything is green locally.

qgallouedec commented 2 years ago

I just thought of something: logically, restore_state should also restore the desired goal, right?

louixp commented 2 years ago

Isn't the desired goal an object whose state is captured by PyBullet?

qgallouedec commented 2 years ago

No, it's the opposite. A target position is sampled, and a fake object (used only for rendering; the agent can't interact with it) is placed in the simulation.

qgallouedec commented 2 years ago

In my opinion, this should work:

import numpy as np
from panda_gym.envs import PandaReachEnv  # import path assumed; adjust to your install

env = PandaReachEnv()
env.reset()

# Save the simulation state before acting
state_id = env.save_state()

# Perform the action
action = env.action_space.sample()
next_obs1, reward, done, info = env.step(action)

# Reset (which samples a new goal), then restore and perform the same action
env.reset()
env.restore_state(state_id)
next_obs2, reward, done, info = env.step(action)

# The observations in both cases should be equal
assert np.all(next_obs1["achieved_goal"] == next_obs2["achieved_goal"])
assert np.all(next_obs1["observation"] == next_obs2["observation"])
assert np.all(next_obs1["desired_goal"] == next_obs2["desired_goal"])

louixp commented 2 years ago

I see what you mean. I didn't assert on the desired goal since it cannot change during an episode, but I can add that.

louixp commented 2 years ago

Done!

qgallouedec commented 2 years ago

I think I explained it badly. Suppose I save the state when the environment has goal A. I then reset the environment, so a new goal B is sampled. When I restore the saved state, I would like the goal to be A again.

I think this can be done by storing the goals in a dictionary that associates state_id with the goal.

qgallouedec commented 2 years ago

maybe something like

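# Assumes self._saved_goal = {} is created in __init__ (not shown here)

def save_state(self) -> int:
    state_id = self.sim.save_state()
    # Remember which goal was active when this state was saved
    self._saved_goal[state_id] = self.task.goal
    return state_id

def restore_state(self, state_id: int) -> None:
    self.sim.restore_state(state_id)
    # Bring back the goal that belongs to the restored state
    self.task.goal = self._saved_goal[state_id]

def remove_state(self, state_id: int) -> None:
    # Drop the stored goal together with the simulation state
    self._saved_goal.pop(state_id)
    self.sim.remove_state(state_id)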
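The goal A / goal B scenario can be tried without PyBullet. The following is only a toy sketch: FakeSim and FakeEnv are hypothetical stand-ins, not panda-gym classes, and the goal is a plain list of floats. It shows the same pattern of a dictionary that maps each state_id to the goal that was active when the state was saved.

```python
import random
from typing import Dict, List


class FakeSim:
    """Hypothetical stand-in for the physics engine; tracks one position."""

    def __init__(self) -> None:
        self.position: List[float] = [0.0, 0.0, 0.0]
        self._snapshots: Dict[int, List[float]] = {}
        self._next_id = 0

    def save_state(self) -> int:
        state_id = self._next_id
        self._next_id += 1
        self._snapshots[state_id] = list(self.position)
        return state_id

    def restore_state(self, state_id: int) -> None:
        self.position = list(self._snapshots[state_id])

    def remove_state(self, state_id: int) -> None:
        del self._snapshots[state_id]


class FakeEnv:
    """Hypothetical environment that samples a new goal on every reset."""

    def __init__(self, seed: int = 0) -> None:
        self.sim = FakeSim()
        self._rng = random.Random(seed)
        self.goal: List[float] = []
        self._saved_goal: Dict[int, List[float]] = {}
        self.reset()

    def reset(self) -> None:
        self.sim.position = [0.0, 0.0, 0.0]
        self.goal = [self._rng.uniform(-1.0, 1.0) for _ in range(3)]

    def save_state(self) -> int:
        state_id = self.sim.save_state()
        # Store a copy so later changes to self.goal cannot alter the record.
        self._saved_goal[state_id] = list(self.goal)
        return state_id

    def restore_state(self, state_id: int) -> None:
        self.sim.restore_state(state_id)
        self.goal = list(self._saved_goal[state_id])

    def remove_state(self, state_id: int) -> None:
        self._saved_goal.pop(state_id)
        self.sim.remove_state(state_id)


env = FakeEnv()
goal_a = list(env.goal)       # goal A, active when the state is saved
state_id = env.save_state()
env.reset()                   # samples a different goal B
goal_b = list(env.goal)
env.restore_state(state_id)   # goal should be A again
restored_goal = list(env.goal)
env.remove_state(state_id)
```

Copying the goal on save is a design choice of this sketch. Storing the reference, as in the snippet above, also works as long as the goal is reassigned rather than mutated in place.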

louixp commented 2 years ago

I see! Just pushed the change.

qgallouedec commented 2 years ago

A useful trick to help you format your code: install black and isort (pip install black isort), then run

black <your-directory>
isort <your-directory>

Here:

black -l 127 panda_gym test
isort -l 127 panda_gym test

-l 127 sets the maximum line length to 127 characters.

louixp commented 2 years ago

Thanks!

qgallouedec commented 2 years ago

Thank you for contributing! Your changes have been included in version 2.0.4 :)

qgallouedec / panda-gym

Implemented methods to save and restore PyBullet states. #33