haosulab / ManiSkill

SAPIEN Manipulation Skill Framework, a GPU parallelized robotics simulator and benchmark
https://maniskill.ai/
Apache License 2.0

[BUG] Unable to replay trajectory in task "StackCube-v1" #601

Closed: idonotlikemondays closed this issue 2 days ago

idonotlikemondays commented 1 week ago

Hi,

1. Failed trajectory replay in task "StackCube-v1". I tried to replay the trajectory of PushCube-v1 but it failed.

python -m mani_skill.trajectory.replay_trajectory \
  --traj-path ~/.maniskill/demos/PushCube-v1/motionplanning/trajectory.h5 \
  --use-first-env-state -c pd_ee_delta_pos -o state \
  --save-traj --num-procs 10 -b cpu --record-rewards True --reward-mode="normalized_dense"

I tried pip uninstall pytorch_kinematics_ms followed by a reinstall, and upgraded to the latest version of ManiSkill (3.0.0b10), but it still doesn't work. Do you have any suggestions for solving this problem?

2. Low conversion rate and worse performance with pd_ee_delta_pos compared to pd_joint_delta_pos. By the way, among the five tasks ['PickCube-v1', 'PushCube-v1', 'StackCube-v1', 'PegInsertionSide-v1', 'PlugCharger-v1'], with pd_ee_delta_pos as the control mode, only PickCube has a 100% conversion success rate; PushCube has a relatively high failure rate, and the remaining three tasks all display "not replayed successfully." Additionally, the results of running diffusion policy on the converted demos are not ideal (as shown in the screenshot). Do you have any effective solutions to address this issue?

[screenshot: Snipaste_2024-10-03_15-01-32]

(The former is pd_joint_delta_pos, and the latter is pd_ee_delta_pos. Theoretically, the latter should be easier to learn than the former?)

Thanks a lot for your help!!

Best, Zhenyu

The following is the error message from replaying the PushCube-v1 trajectory:

logger.warn(
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/site-packages/mani_skill/trajectory/replay_trajectory.py", line 487, in _main
    ori_env.set_state_dict(ori_env_states[0])
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/site-packages/mani_skill/envs/sapien_env.py", line 1086, in set_state_dict
    self.scene.set_sim_state(state, env_idx)
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/site-packages/mani_skill/envs/scene.py", line 799, in set_sim_state
    self.articulations[art_id].set_state(art_state, None)
KeyError: 'panda'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/site-packages/mani_skill/trajectory/replay_trajectory.py", line 613, in <module>
    main(parse_args())
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/site-packages/mani_skill/trajectory/replay_trajectory.py", line 594, in main
    res = pool.starmap(_main, proc_args)
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/ubuntu/.conda/envs/zhenyu_maniskill/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
KeyError: 'panda'

StoneT2000 commented 1 week ago

Thanks for the issue. I see some of the datasets are outdated and need to be regenerated due to a small change in the environment.

I'll also investigate the low conversion success rates

idonotlikemondays commented 1 week ago

Thanks for the issue. I see some of the datasets are outdated and need to be regenerated due to a small change in the environment.

I'll also investigate the low conversion success rates

Thanks a lot! I really appreciate your help! I also want to ask if it's possible to add demonstrations for more tasks, e.g. PushT or TwoRobot-related tasks?

StoneT2000 commented 1 week ago

Yeah we plan to. Some people will work on writing motion planning solutions / generating teleop demos for more tasks over time.

StoneT2000 commented 1 week ago

PushCube replay works as intended.

StackCube demos are re-uploaded now.

StoneT2000 commented 1 week ago

What is the script you are using to train your policy? It does seem strange that joint space control does worse; pd_ee_delta_pos should work fine as well.

idonotlikemondays commented 1 week ago

What is the script you are using to train your policy? It does seem strange that joint space control does worse; pd_ee_delta_pos should work fine as well.

I use the diffusion policy script to train the policy. The pd_joint_delta_pos control mode works fine, but training is basically not usable with the pd_ee_delta_pos control mode. I noticed that when I replay the PushCube task with the pd_ee_delta_pos control mode, the conversion success rate is not good (compared to the pd_joint_delta_pos control mode, which has a nearly 100% conversion success rate). I guess that might be the reason?
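As a quick sanity check on the conversion rate, one can compare the number of episodes in the replayed file against the source file. Below is a minimal sketch, assuming each episode is stored as a top-level traj_* group and that failed replays are dropped from the saved output by default; the converted file name is illustrative and may differ on disk:

import h5py

def count_episodes(path: str) -> int:
    # Each demo episode is assumed to live in its own top-level "traj_*" group.
    with h5py.File(path, "r") as f:
        return sum(1 for key in f.keys() if key.startswith("traj"))

source = count_episodes("trajectory.h5")  # original demos
converted = count_episodes("trajectory.state.pd_ee_delta_pos.h5")  # replayed output (illustrative name)
print(f"conversion success rate: {converted / source:.1%}")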

idonotlikemondays commented 1 week ago

PushCube replay works as intended.

StackCube demos are re-uploaded now.

Hi,

I did some experiments with the new version of the StackCube demos (both pd_ee_delta_pos and pd_joint_delta_pos), but the results show poor performance.

I want to ask whether this is due to the demos themselves or because the algorithm (diffusion policy in my case) is not sufficient for this task? However, diffusion policy performs well on PegInsertionSide... 🤔

[image: evaluation results]

Thanks a lot!

StoneT2000 commented 6 days ago

Can you try increasing the max episode steps? The demos are kind of slow, so imitating them results in a policy that takes rather long to reach success.

For example, StackCube's max episode steps is 50 (tuned for RL), but the motion planning demos might average around 100 steps.
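If it helps, one way to raise the limit when building the evaluation environment is to pass max_episode_steps at creation time. This is only a minimal sketch assuming the standard gymnasium registration; the diffusion policy baseline script may expose its own argument for this:

import gymnasium as gym
import mani_skill.envs  # registers the ManiSkill environments with gymnasium

env = gym.make(
    "StackCube-v1",
    max_episode_steps=200,  # the registered default is 50, tuned for RL
    obs_mode="state",
    control_mode="pd_ee_delta_pos",
)
obs, _ = env.reset(seed=0)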

idonotlikemondays commented 6 days ago

Can you try increasing the max episode steps? The demos are kind of slow, so imitating them results in a policy that takes rather long to reach success.

For example, StackCube's max episode steps is 50 (tuned for RL), but the motion planning demos might average around 100 steps.

Yes, these results are based on max_episode_steps=100. I ran it with this config:

total_iters: 30000
batch_size: 128 # (since 1024 is kinda slow)
max_episode_steps: 100
num_demos: 100
num_diffusion_iters: 5 # (the original code uses 100, but I tried 5 on other tasks and got better performance and faster runtime)

I will try max_episode_steps=200 later. Do you have any other ideas for dealing with this issue?

StoneT2000 commented 2 days ago

Actually, it turns out it should be more than 100 max episode steps. Looking at the eval videos, the model does pretty well but is just slow, because the demos are a bit slow; with a max of 200 steps it works better.

It seems I need to establish a solid set of recommended max episode steps. Especially given that most pure offline imitation learning algorithms are not meant to optimize for fast solving, and instead mimic the demo distribution as much as possible, I may just recommend that users set max episode steps to about 1.5x the mean episode length of the demonstration data and then only check the success_once evaluation metric.
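Something along these lines could compute that recommendation from a demo file. It is only a sketch, assuming each episode in the trajectory .h5 is a traj_* group containing an actions dataset:

import os

import h5py
import numpy as np

def recommended_max_episode_steps(traj_path: str, factor: float = 1.5) -> int:
    # Mean demo length scaled by the suggested factor (~1.5x), rounded up.
    with h5py.File(os.path.expanduser(traj_path), "r") as f:
        lengths = [
            f[key]["actions"].shape[0]
            for key in f.keys()
            if key.startswith("traj")
        ]
    return int(np.ceil(factor * float(np.mean(lengths))))

print(recommended_max_episode_steps(
    "~/.maniskill/demos/StackCube-v1/motionplanning/trajectory.h5"
))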

StoneT2000 commented 2 days ago

Also, for PushCube I find no issue with training on that task. Using the example.sh script in the diffusion policy baseline folder works for me.

[image: training results]