haosulab / ManiSkill

SAPIEN Manipulation Skill Framework, a GPU parallelized robotics simulator and benchmark
https://maniskill.ai/
Apache License 2.0

Missing VecEnv Wrapper and inconsistent observation shapes in example evaluation script "run_evaluation.py" #189

Closed · lukas-ruettgers closed this issue 8 months ago

lukas-ruettgers commented 8 months ago

Missing VecEnv Wrapper

For the challenge submission, you provide the run_evaluation.py and evaluator.py scripts in the ManiSkill2 repository (/mani_skill2/evaluation/).

In the Evaluation class in your ManiSkill2-Learn repository (maniskill2_learn/env/evaluation.py), you correctly wrap the raw OpenAI gym environment with your MS2-specific wrappers (UnifiedVectorEnvAPI<VectorEnvBase<BufferAugmentedEnv<ExtendedEnv<TimeLimit<ManiSkill2_ObsWrapper<RenderInfoWrapper<...>>>>>>).

But in the evaluator.py script, you do not wrap the raw Gymnasium environment with these wrappers.

self.env: BaseEnv = gym.make(
    self.env_id,
    obs_mode=obs_mode,
    control_mode=control_mode,
    render_mode=render_mode,
    **self.env_kwargs
)
self.policy = policy_cls(
    self.env_id, self.env.observation_space, self.env.action_space
)
self.result = OrderedDict()

This is problematic because our trained models rely on the frame transformations, normalization, downsampling, and shape adjustments that these wrappers apply to the observations.

As a workaround, we therefore added these wrappers ourselves in our own code (a sketch follows below), but this is clearly not desirable.
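For illustration, here is a rough sketch of what this workaround looks like. The import path maniskill2_learn.env.wrappers, the wrapper constructor arguments, and the helper name wrap_like_training are assumptions on our side, inferred from the wrapper chain quoted above, and may not match the actual ManiSkill2-Learn API:

# Sketch of the workaround: re-apply the ManiSkill2-Learn observation wrappers
# on top of the raw environment that evaluator.py builds with gym.make().
# NOTE: the import path and constructor arguments are assumptions; check them
# against the ManiSkill2-Learn source before using this.
from maniskill2_learn.env.wrappers import ManiSkill2_ObsWrapper, RenderInfoWrapper

def wrap_like_training(env):
    # RenderInfoWrapper exposes render information in the step() infos;
    # ManiSkill2_ObsWrapper applies the frame transformations, normalization,
    # downsampling, and shape adjustments our policies were trained with.
    env = RenderInfoWrapper(env)
    env = ManiSkill2_ObsWrapper(env)
    return env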

Inconsistent observation shapes

Furthermore, the values in self.env.observation_space are inconsistent with those provided by get_env_info() in the maniskill2_learn.env folder.

To obtain the size of the state vector, we summed the sizes of all corresponding sub-spaces in self.env.observation_space in the following fashion.

# Sum the flattened sizes of the 'agent' and 'extra' sub-spaces
# to recover the dimension of the state vector.
state_shape = 0
agent_space = observation_space['agent']
extra_space = observation_space['extra']
for space in [agent_space, extra_space]:
    for key in space.keys():
        if len(space[key].shape) == 0:
            # Scalar entries contribute a single dimension.
            state_shape += 1
        else:
            state_shape += space[key].shape[0]

However, the final value of state_shape exceeds the true value provided by get_env_info() by 8 for both the StackCube-v0 and TurnFaucet-v0 environments. For this comparison, we of course set env_name in our environment config env_cfg to the value of env_id:

env_cfg.env_name = env_id
env_params = get_env_info(env_cfg)
obs_shape = env_params["obs_shape"]
action_shape = env_params["action_shape"]

For the StackCube-v0 environment, we obtain state_shape=32, while obs_shape in env_params is 24. For the TurnFaucet-v0 environment, we similarly obtain state_shape=39, while obs_shape in env_params is 31.

The latter way of computing the observation shapes via get_env_info() is also used in your aforementioned Evaluation class and is therefore consistent with the shapes seen during training. In our user_solution.py, we thus ignore the observation_space argument and instead fetch the space specifications from get_env_info().
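Concretely, our policy constructor does something along the following lines. The class name UserPolicy and the helper load_training_env_cfg() are hypothetical names standing in for our own solution code, and the import path for get_env_info is an assumption:

# Sketch of our policy constructor: the signature matches how evaluator.py
# instantiates the policy, i.e. policy_cls(env_id, observation_space, action_space),
# but observation_space is ignored in favor of get_env_info().
from maniskill2_learn.env import get_env_info  # assumed import path

env_cfg = load_training_env_cfg()  # hypothetical helper: loads the env config used during training

class UserPolicy:
    def __init__(self, env_id, observation_space, action_space):
        # observation_space from evaluator.py is ignored; we instead query the
        # shapes that were actually used during training.
        env_cfg.env_name = env_id
        env_params = get_env_info(env_cfg)
        self.obs_shape = env_params["obs_shape"]
        self.action_shape = env_params["action_shape"]
        self.action_space = action_space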

However, we suspect that this discrepancy is an error in your example script.

xuanlinli17 commented 8 months ago

Hi, thanks for the feedback.

The evaluation script in the ManiSkill2 repo is a general script that is not bound to any specific learning repo. If you use a specific learning repo such as ManiSkill2-Learn, you indeed need to write your own user_solution.py (roughly along the lines sketched below) and then evaluate it with the evaluation script in ManiSkill2.
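Roughly, a user_solution.py skeleton looks like the following; the details here are only a sketch, so please check the submission_example linked below for the exact, up-to-date interface:

# user_solution.py (sketch): evaluator.py constructs the policy as
# policy_cls(env_id, observation_space, action_space) and queries the
# observation / control modes before creating the environment.
# See the linked submission_example for the exact interface.
from mani_skill2.evaluation.solution import BasePolicy

class UserPolicy(BasePolicy):
    def __init__(self, env_id, observation_space, action_space):
        super().__init__(env_id, observation_space, action_space)
        self.action_space = action_space
        # Load your trained model and observation preprocessing here.

    def reset(self, observations):
        # Called at the beginning of each episode.
        pass

    def act(self, observations):
        # Apply the same observation processing used during training,
        # then return an action inside action_space.
        return self.action_space.sample()

    @classmethod
    def get_obs_mode(cls, env_id: str) -> str:
        return "state"  # must match the obs_mode used during training

    @classmethod
    def get_control_mode(cls, env_id: str) -> str:
        return "pd_ee_delta_pose"  # must match the control_mode used during training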

We used to have a submission_example in ManiSkill2-Learn, but since we are migrating to Gymnasium and updating our repo, it has been removed from the main branch. You can still view it here: https://github.com/haosulab/ManiSkill2-Learn/tree/ms2_gym/submission_example

lukas-ruettgers commented 8 months ago

Thank you for your fast reply!