Cannot reproduce tool_hang results on image observations

Pingcheng-Jian commented 4 months ago

Hi,

I used python robomimic/scripts/generate_paper_configs.py --output_dir /tmp/experiment_results to generate the configs for reproducing the results in the paper, and then ran python robomimic/scripts/train.py --config robomimic/exps/paper/core/tool_hang/ph/image/bc_rnn.json to train on the tool_hang task with image observations.

However, the best success rate I get is 24% (at epoch 580) over the 600-epoch training run. That is far lower than the 67.3 ± 4.1 reported in Table 3 of the robomimic study paper (https://arxiv.org/pdf/2108.03298).

I believe I have strictly followed the instructions in the robomimic documentation to reproduce the results in the paper. Why is the difference in success rate so large? Has anyone else run into the same issue?

Thanks a lot!

amandlek commented 4 months ago

It'd help to provide some more details. For example, how did you download / postprocess the Tool Hang dataset? What version of robosuite are you on (branch / commit / version)?

Pingcheng-Jian commented 4 months ago

Hi amandlek,

I downloaded the tool_hang dataset with image observations using: python download_datasets.py --tasks tool_hang --dataset_types ph --hdf5_types image

That gave me the datasets/tool_hang/ph/image_v141.hdf5 dataset.

According to the robomimic documentation, this dataset does not need any postprocessing. Only the raw dataset demo_v141.hdf5 needs postprocessing.

I am using robomimic version 0.3.0 and robosuite version 1.4.1. I installed both from source with pip install -e .
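
Since the reproduction depends on the exact package versions, it may help to print what the active Python environment actually has installed. The following is a minimal sketch using only the standard library; the helper name installed_versions is made up for illustration, and the distribution names (robomimic, robosuite, mujoco, mujoco-py) are assumed to match the names the packages were installed under.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    names = ["robomimic", "robosuite", "mujoco", "mujoco-py"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

For editable installs, the reported version reflects the metadata written at install time, so the branch or commit of the checkout is worth reporting separately.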

Pingcheng-Jian commented 4 months ago

I think the latest version is v0.3.1. I am using v0.3.0 now.

Do I have to switch to v0.3.1 to get the tool_hang training to work correctly?

amandlek commented 4 months ago

Thanks for sharing!

You shouldn't be able to download the image Tool Hang dataset for v1.4.1 of robosuite (see this PR). When I run your first command I get "Skipping tool_hang-ph-image, no url for dataset exists. Create this dataset locally by running the appropriate command from robomimic/scripts/extract_obs_from_raw_datasets.sh."

I would recommend two different things to try:

1. Try downloading the raw hdf5 and then postprocessing to get the image hdf5 (and you can follow up with the commands used here if you want confirmation).
2. Try downgrading robosuite to the offline_study branch, and then use the Tool Hang image dataset from robomimic v0.2: https://robomimic.github.io/docs/v0.2/datasets/robomimic_v0.1.html

Pingcheng-Jian commented 4 months ago

Hi amandlek,

You are right. Sorry, I misremembered. Your comment reminds me that the download of image_v141.hdf5 did fail with "Skipping tool_hang-ph-image, no url for dataset exists. Create this dataset locally by running the appropriate command from robomimic/scripts/extract_obs_from_raw_datasets.sh."

So I downloaded the raw dataset demo_v141.hdf5 and postprocessed it with: python dataset_states_to_obs.py --done_mode 2 --dataset $BASE_DATASET_DIR/tool_hang/ph/demo_v141.hdf5 --output_name image_v141.hdf5 --camera_names sideview robot0_eye_in_hand --camera_height 240 --camera_width 240

Sorry for the confusion; this was a week ago. In other words, I have already done point 1 of your recommendations.

Which versions of robosuite and robomimic should I use for the offline_study setup? Can I use git checkout vx.x.x to get the correct version?
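
The postprocessing command above is long enough that it can be easier to assemble programmatically, for example when regenerating the dataset under a different robosuite version. This is only a sketch: build_postprocess_argv is a hypothetical helper, the flags and default values are copied from the command above, the dataset path in the example is a placeholder, and nothing here is part of robomimic itself.

```python
import shlex
import sys


def build_postprocess_argv(dataset, output_name="image_v141.hdf5",
                           cameras=("sideview", "robot0_eye_in_hand"),
                           height=240, width=240, done_mode=2):
    """Assemble the dataset_states_to_obs.py call as an argv list (not executed)."""
    return [
        sys.executable, "dataset_states_to_obs.py",  # same interpreter as this script
        "--done_mode", str(done_mode),
        "--dataset", str(dataset),
        "--output_name", output_name,
        "--camera_names", *cameras,
        "--camera_height", str(height),
        "--camera_width", str(width),
    ]


if __name__ == "__main__":
    # Placeholder path; point it at your own raw demo_v141.hdf5.
    argv = build_postprocess_argv("datasets/tool_hang/ph/demo_v141.hdf5")
    # Prints a copy-pasteable command without running it.
    print(shlex.join(argv))
```

To actually run it, the list can be passed to subprocess.run(argv, check=True) from the directory that contains dataset_states_to_obs.py.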

amandlek commented 4 months ago

For robosuite, just check out the offline_study branch. For robomimic, you could check out v0.2 (https://github.com/ARISE-Initiative/robomimic/releases/tag/v0.2.0).
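
To keep the two setups discussed in this thread apart, the pairings can be written down as data and checked before training. This is a sketch, not an official compatibility matrix: the SETUPS table and the matching_setup helper are hypothetical, and the entries only restate what is said in this thread.

```python
# Pairings as described in this thread; not an official compatibility matrix.
SETUPS = {
    "robosuite_v141": {
        "robomimic": "v0.3.0",
        "robosuite": "v1.4.1",
        "image_dataset": "postprocess locally from the raw demo_v141.hdf5",
    },
    "offline_study": {
        "robomimic": "v0.2.0",
        "robosuite": "offline_study",
        "image_dataset": "use the Tool Hang image dataset from robomimic v0.2",
    },
}


def matching_setup(robomimic, robosuite):
    """Return the name of the setup matching both versions, or None otherwise."""
    for name, setup in SETUPS.items():
        if setup["robomimic"] == robomimic and setup["robosuite"] == robosuite:
            return name
    return None
```

A return value of None only means the combination is not one of the two pairings mentioned here; it does not by itself show that the combination is broken.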

Pingcheng-Jian commented 4 months ago

Hi amandlek,

Sounds good! I am checking out these branches and re-running the tool_hang training now. I will let you know the results as soon as possible. Thanks for your patient assistance!

Pingcheng-Jian commented 4 months ago

I am now on the offline_study branch for robosuite and v0.2.0 for robomimic.

Do I need to delete the image_v141.hdf5 file that I previously generated with the postprocessing command under robomimic v0.3.0 and robosuite v1.4.1, and then download the image_v141.hdf5 for robomimic v0.2.0 and robosuite offline_study?

amandlek commented 4 months ago

yes

Pingcheng-Jian commented 4 months ago

I ran git checkout offline_study in robosuite and then pip install -e . and got this error:

Building wheels for collected packages: mujoco-py
  Building wheel for mujoco-py (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for mujoco-py (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [71 lines of output]
      running bdist_wheel
      running build

      You appear to be missing MuJoCo.  We expected to find the file here: /home/pj81/.mujoco/mujoco200

      This package only provides python bindings, the library must be installed separately.

      Please follow the instructions on the README to install MuJoCo

          https://github.com/openai/mujoco-py#install-mujoco

      Which can be downloaded from the website

          https://www.roboti.us/index.html

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for mujoco-py
Failed to build mujoco-py
ERROR: Could not build wheels for mujoco-py, which is required to install pyproject.toml-based projects

How can I install mujoco-py?

Pingcheng-Jian commented 4 months ago

Can I use robomimic v0.2.0 with robosuite v1.4.1 instead of robosuite offline_study? If I don't use mujoco-py, can I use the mujoco package? Which mujoco version would work?

I am using mujoco 3.2.0 now. Do I need an older version of mujoco, such as the mujoco==2.3.2 required by the mimicgen package?

amandlek commented 4 months ago

When you downgrade robosuite, you shouldn't need to re-install it. I would make sure mujoco-py is uninstalled, and I'd stick with mujoco==2.3.2 if possible.

Pingcheng-Jian commented 4 months ago

I see. I will only run git checkout offline_study in the robosuite repository, and I will not run pip3 install -r requirements.txt again.

I will use mujoco==2.3.2.

I will let you know how training goes with this setup. Thanks.

amandlek commented 4 months ago

I apologize - I just realized that the offline_study branch of robosuite is likely still using the mujoco-py backend, so you will likely need to install mujoco-py. For the error you mentioned, you should make a directory (.mujoco) in your home directory and put the unzipped mujoco library there.
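
Before retrying the mujoco-py build, it can be worth checking that the directory named in the error message exists. A minimal sketch follows; mujoco_dir_status is a hypothetical helper, and the mujoco200 folder name is taken from the error above (/home/pj81/.mujoco/mujoco200), so it may differ for other mujoco-py versions.

```python
from pathlib import Path


def mujoco_dir_status(home=None, folder="mujoco200"):
    """Return (expected path, whether it exists as a directory)."""
    base = Path(home) if home is not None else Path.home()
    expected = base / ".mujoco" / folder
    return expected, expected.is_dir()


if __name__ == "__main__":
    path, ok = mujoco_dir_status()
    print(f"{path}: {'found' if ok else 'missing'}")
```

If the directory is reported missing, the unzipped library has not been placed where the build looks for it.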

Pingcheng-Jian commented 4 months ago

Sure, I will try that.
