
Issue about the state_dim mismatch between the dataset and the environment #115

Closed codyuan closed 10 months ago

codyuan commented 10 months ago

I am new to robomimic, and I want to use the provided datasets, such as http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mh/low_dim_v141.hdf5. However, I couldn't find an effective way to reproduce the environment the data was collected in. Specifically, I use this code:

config = config_factory("bc") ObsUtils.initialize_obs_utils_with_config(config)

env_meta = FileUtils.get_env_metadata_from_dataset(my_path_to_hdf5) env=gym.make("Lift")

env = EnvUtils.create_env_from_metadata( env_meta=env_meta, env_name=env_meta["env_name"], render=False, render_offscreen=False, use_image_obs=False, ) env = EnvUtils.wrap_env_from_config(env, config=config)

This way, env.reset() always returns a dict rather than the 32-dim array stored in the hdf5. I wonder how to deal with this, since the same issue happens with the other datasets you provide.

amandlek commented 10 months ago

This is because observations in robosuite and robomimic are dictionaries, not flat numpy arrays. I would encourage you to read some of our documentation and try some of our examples - I am posting some relevant links below.

- Information on observations: https://robomimic.github.io/docs/tutorials/observations.html
- Information on robosuite datasets and how simulator states can be used to extract observations: https://robomimic.github.io/docs/datasets/robosuite.html
- Dataset structure: https://robomimic.github.io/docs/datasets/overview.html#dataset-structure
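As a quick illustration of the first point, you can inspect the dictionary that the env returns (a minimal sketch, assuming the env created in the snippet above):

```python
obs = env.reset()  # robomimic/robosuite observations are dicts, not flat arrays
for k, v in obs.items():
    print(k, getattr(v, "shape", type(v)))  # e.g. robot0_eef_pos (3,), ...
```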

codyuan commented 10 months ago

I have read your links and tried some of the examples. However, I still cannot make the env return the flat state I need. Specifically, I thought the flat state in the dataset was extracted by concatenating some values from the obs dict. The problem is that I do not know which keys correspond to the values needed. How can I extract them from the dataset file, or otherwise make my env work? I couldn't find any clues. Could you please provide a more specific example to follow? It has really confused me.

codyuan commented 10 months ago

I found a wrapper in robosuite (https://github.com/ARISE-Initiative/robosuite/blob/v1.4.1/robosuite/wrappers/gym_wrapper.py) that can generate a flattened obs by extracting keys from the obs dict. It seems to produce a flat obs with the same dims as the observations collected in http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mh/low_dim_v141.hdf5.

My code to reproduce the env is now:

```python
import robosuite
from robosuite.wrappers import GymWrapper

env = robosuite.make(
    "Lift",
    robots=["Panda"],             # load a Panda robot
    has_renderer=False,           # no on-screen rendering
    has_offscreen_renderer=False, # no off-screen rendering
    control_freq=20,              # 20 Hz control for applied actions
    horizon=200,                  # each episode terminates after 200 steps
    use_object_obs=False,         # don't provide object observations to the agent
    use_camera_obs=False,         # don't provide image observations to the agent
    reward_shaping=True,          # use a dense reward signal for learning
)
env = GymWrapper(env)
```

Is this the right way? I am uncertain about it because the action dim of this env (8) is different from the actions contained in the dataset (7). The same issue happens with another env, Square: there the state dim also differs (32 dim for the env vs. 45 dim for the dataset file).
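For what it's worth, this wrapper also accepts an explicit keys argument, so you can control which entries get flattened - a sketch, assuming the v1.4.1 GymWrapper signature and that the listed keys exist for your env config:

```python
from robosuite.wrappers import GymWrapper

# Instead of the plain GymWrapper(env) above: flatten only the chosen keys.
# Which keys are available depends on the env settings (e.g. "object-state"
# requires use_object_obs=True).
env = GymWrapper(env, keys=["robot0_proprio-state", "object-state"])
```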

amandlek commented 10 months ago

I'm a little confused about what exactly you are trying to do. The dataset hdf5 stores observations in an h5py Group - for example, if you open the hdf5, f["data/demo_0/obs"].keys() will give you a list of the observations in the hdf5 - these correspond to keys in the observation dictionary. A subset of these keys is provided to the agent during training - these are specified in the training config (for example, here).
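For instance, a minimal h5py sketch along those lines (assuming my_path_to_hdf5 points at the low_dim_v141.hdf5 file from above):

```python
import h5py

with h5py.File(my_path_to_hdf5, "r") as f:
    obs_group = f["data/demo_0/obs"]
    for k in obs_group.keys():
        print(k, obs_group[k].shape)  # each key's per-timestep observation shape
```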

codyuan commented 10 months ago

I am trying to implement or reproduce an env that returns states (flat numpy arrays) aligned with the ones in the dataset http://downloads.cs.stanford.edu/downloads/rt_benchmark/lift/mh/low_dim_v141.hdf5 when I call functions like reset() or step(). I don't know how these states are extracted from the observation dictionary - are they built from several keys in the dictionary? I checked the link you provided above and found that the keys in "low_dim" don't seem to produce a state matching the one in the hdf5 dataset.

amandlek commented 10 months ago

I'm guessing that you are talking about the actual MuJoCo simulator state - this is located in f["data/demo_0/states"] for example. Again, the dataset structure is documented here. I would not recommend training using these states as observations - they are typically just used for resetting the simulator to do certain operations such as trajectory playback and observation extraction.
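For example, a playback-style sketch (assuming env was created with EnvUtils.create_env_from_metadata as above; the reset_to call mirrors the pattern used in robomimic's playback scripts):

```python
import h5py

with h5py.File(my_path_to_hdf5, "r") as f:
    states = f["data/demo_0/states"][()]  # (T, D) flat MuJoCo simulator states

env.reset()
env.reset_to({"states": states[0]})  # restore the first recorded simulator state
```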

amandlek commented 10 months ago

If you want to get the current simulator state from the environment, you can use this function.
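For instance (a hedged sketch - I'm assuming the linked function is EnvBase.get_state(), which returns a dict holding the flat simulator state):

```python
state = env.get_state()["states"]  # flat MuJoCo state, comparable to data/demo_N/states
```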

codyuan commented 10 months ago

Thanks a lot! This function works for me. Since you said you "would not recommend training using these states as observations", would the better way be to train an encoder to handle the raw obs dictionary?

amandlek commented 10 months ago

Yes, it is recommended to use the observations, not the low-level states (since those are pretty much meant just for resetting the simulation).