haosulab / ManiSkill

SAPIEN Manipulation Skill Framework, a GPU parallelized robotics simulator and benchmark
https://maniskill.ai/
Apache License 2.0

Error while training a DAPG agent under "env_name=PegInsertionSide-v0" #81

Closed: xtli12 closed this issue 1 year ago

xtli12 commented 1 year ago

Hi, when I tried to train a DAPG agent under "env_name=PegInsertionSide-v0" with the following command:

DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_pn.py 
--work-dir /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/Result/PegInsertionSide-v0/DAPG 
--gpu-ids 2 --cfg-options "env_cfg.env_name=PegInsertionSide-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.reward_mode=dense" "rollout_cfg.num_procs=5"
 "env_cfg.control_mode=pd_ee_delta_pose" "env_cfg.obs_frame=ee" "agent_cfg.demo_replay_cfg.capacity=20000" 
"agent_cfg.demo_replay_cfg.cache_size=20000" "agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1" 
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.h5"  
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True"

the following error came out:

PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-22,09:52:31 - meta_collect_time: 2023-04-22-02:52:31
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-22,09:52:31 - PYRL: version: 1.8.0b0
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-22,09:52:31 - ManiSkill2: 
PegInsertionSide-v0-train - (run_rl.py:243) - INFO - 2023-04-22,09:52:31 - Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:245) - INFO - 2023-04-22,09:52:31 - Finish Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:253) - INFO - 2023-04-22,09:52:31 - Build agent!
PegInsertionSide-v0-train - (replay_buffer.py:59) - INFO - 2023-04-22,09:52:31 - Load 1 files!
PegInsertionSide-v0-train - (logger.py:155) - INFO - 2023-04-22,09:52:31 - 0%|          | 0/1 [00:00<?, ?it/s]
PegInsertionSide-v0-train - (logger.py:155) - INFO - 2023-04-22,09:52:31 - 100%|##########| 1/1 [00:00<00:00,  5.68it/s]
PegInsertionSide-v0-train - (replay_buffer.py:62) - INFO - 2023-04-22,09:52:31 - Load 1 files with 154419 samples in total!
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 254, in main_rl
    agent = build_agent(cfg.agent_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/builder.py", line 12, in build_agent
    return build_from_cfg(cfg, agent_type, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 134, in __init__
    self.demo_replay = build_replay(demo_replay_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/builder.py", line 27, in build_replay
    return build_from_cfg(cfg, REPLAYS, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 77, in __init__
    self.file_loader = FileCache(
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/file/cache_utils.py", line 450, in __init__
    self.shared_buffer = create_shared_dict_array_from_files(filenames, capacity, data_coder, keys, keys_map=keys_map)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/file/cache_utils.py", line 138, in create_shared_dict_array_from_files
    item = DictArray(item)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 763, in __init__
    self.assert_shape(self.memory, self.capacity)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 802, in assert_shape
    assert cls.check_shape(memory, capacity), f"The first dimension is not {capacity}!"
AssertionError: The first dimension is not 137!
Exception ignored in: <function ReplayMemory.__del__ at 0x7fab7d260790>
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 257, in __del__
    self.close()
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 253, in close
    if self.file_loader is not None:
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 154, in __getattr__
    return getattr(self.memory, key, None)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 154, in __getattr__
    return getattr(self.memory, key, None)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/env/replay_buffer.py", line 154, in __getattr__
    return getattr(self.memory, key, None)
  [Previous line repeated 995 more times]
RecursionError: maximum recursion depth exceeded
Exception ignored in: <function SharedGDict.__del__ at 0x7fac5a2bbd30>
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/shared_memory.py", line 237, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/resource_tracker.py:203: UserWarning: resource_tracker: There appear to be 18 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d 

How can I solve it?

xuanlinli17 commented 1 year ago

agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.h5

I think you used the state demo instead of the point cloud demo?
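
If you're not sure, one quick check is to list the keys inside the demo file. This is just a sketch: it assumes the HDF5 command-line tools (h5ls) are installed, and the xyz/rgb key names are the ones produced by ManiSkill2-Learn's point cloud conversion, so the exact layout may differ.

    # List the dataset tree of the demo file and search for point cloud observation keys.
    # A raw/state demo will not contain any xyz/rgb observation arrays.
    h5ls -r demos/rigid_body/PegInsertionSide-v0/trajectory.h5 | grep -E "xyz|rgb"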

xtli12 commented 1 year ago

Oh, I'm afraid I did use the state demo. Based on your suggestion, I ran replay_trajectory.py:

python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PegInsertionSide-v0/trajectory.h5 \
--save-traj --target-control-mode pd_ee_delta_pose --obs-mode pointcloud --num-procs 1

to convert the raw files into the desired observation and control modes, and then ran the script:

python tools/convert_state.py --env-name PegInsertionSide-v0 --num-procs 1 \
--traj-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.pointcloud.pd_ee_delta_pose.h5 \
--json-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.pointcloud.pd_ee_delta_pose.json \
--output-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/pd_ee_delta_pose.h5 \
--control-mode pd_ee_delta_pose --max-num-traj -1 --obs-mode pointcloud --n-points 1200 --obs-frame ee --reward-mode dense --render

to render point cloud demonstrations. But when I ran the script to train a DAPG agent under env_name=PegInsertionSide-v0:

DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_pn.py
--work-dir /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/Result/PegInsertionSide-v0/dapg_pointcloud
--gpu-ids 2   --cfg-options "env_cfg.env_name=PegInsertionSide-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200"             
"rollout_cfg.num_procs=5" "env_cfg.reward_mode=dense" "env_cfg.control_mode=pd_ee_delta_pose" "env_cfg.obs_frame=ee" 
"agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000" "agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1" 
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegIPegInsertionSide-v0/pd_ee_delta_pose.h5"
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" "train_cfg.total_steps=25000000" "train_cfg.n_checkpoint=5000000"

the following error came out:

PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - meta_collect_time: 2023-04-23-05:40:31
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - PYRL: version: 1.8.0b0
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - ManiSkill2: 
PegInsertionSide-v0-train - (run_rl.py:243) - INFO - 2023-04-23,12:40:31 - Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:245) - INFO - 2023-04-23,12:40:31 - Finish Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:253) - INFO - 2023-04-23,12:40:31 - Build agent!
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 254, in main_rl
    agent = build_agent(cfg.agent_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/builder.py", line 12, in build_agent
    return build_from_cfg(cfg, agent_type, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 137, in __init__
    assert key in self.demo_replay.memory, f"DAPG needs {key} in your demo!"
TypeError: argument of type 'NoneType' is not iterable
Exception ignored in: <function SharedGDict.__del__ at 0x7f6e741fdca0>
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/shared_memory.py", line 237, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/resource_tracker.py:203: UserWarning: resource_tracker: There appear to be 18 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

How can I solve it?

xuanlinli17 commented 1 year ago

Please follow this script (https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/general_rigid_body_single_object_envs.sh) to convert the trajectories; a rough sketch of its two-step pipeline is shown after the notes below.

Note:

  1. As noted in the ManiSkill2 documentation (https://haosulab.github.io/ManiSkill2/concepts/demonstrations.html), it is not recommended to run mani_skill2.trajectory.replay_trajectory with a visual observation mode. The visual observations are not post-processed, consume a lot of space, and cannot be directly used by downstream learning frameworks such as ManiSkill2-Learn; that is what causes your error.
  2. --render in ManiSkill2-Learn's tools/convert_state.py is not meant to render point cloud demonstrations. It shows the demonstration on the monitor for visualization, and it only works with --num-procs=1. When you parallelize demo conversion with a higher --num-procs, remove this argument.
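
For reference, a rough sketch of what that conversion script does (the paths and intermediate file names are illustrative; the flags themselves are the ones already used in this thread):

    # Step 1 (ManiSkill2): replay the raw demo into the target control mode,
    # keeping obs_mode=none so no bulky visual observations are stored.
    python -m mani_skill2.trajectory.replay_trajectory \
    --traj-path demos/rigid_body/PegInsertionSide-v0/trajectory.h5 \
    --save-traj --target-control-mode pd_ee_delta_pose --obs-mode none --num-procs 16

    # Step 2 (ManiSkill2-Learn): render point cloud observations offline.
    # No --render here; that flag is only for on-screen visualization with --num-procs 1.
    python tools/convert_state.py --env-name PegInsertionSide-v0 --num-procs 16 \
    --traj-name demos/rigid_body/PegInsertionSide-v0/trajectory.none.pd_ee_delta_pose.h5 \
    --json-name demos/rigid_body/PegInsertionSide-v0/trajectory.none.pd_ee_delta_pose.json \
    --output-name demos/rigid_body/PegInsertionSide-v0/trajectory_pointcloud.h5 \
    --control-mode pd_ee_delta_pose --max-num-traj -1 --obs-mode pointcloud \
    --n-points 1200 --obs-frame ee --reward-mode dense
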
xtli12 commented 1 year ago

Hi, with your kind assistance, I was able to solve the DAPG training issues for env_name=PegInsertionSide-v0, AssemblingKits-v0, and PlugCharger-v0. Thank you very much! However, when training DAPG under env_name=PushChair-v1, I found that the PushChair-v1 directory contains many subfolders, each with its own .h5 file. When I followed the script (https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/general_rigid_body_multi_object_envs.sh) to merge the .h5 files using:

    ENV="PushChair-v1"
    python -m mani_skill2.trajectory.merge_trajectory \
    -i /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/$ENV/ \
    -o /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/$ENV/trajectory_merged.h5 \
    -p trajectory.h5

the following error came out (the same script worked fine with env_name=TurnFaucet-v0 and merged its .h5 files successfully):

(sapien) lxt21@ubuntu:~/SAPIEN-master/ManiSkill2-Learn-main-old/scripts$ bash test_general_rigid_body_multi_object_envs.sh 
Merge to /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/trajectory_merged.h5
Merging /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3001/trajectory.h5
Merging /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 81, in <module>
    main()
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 77, in main
    merge_h5(args.output_path, traj_paths)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 32, in merge_h5
    assert str(env_info) == str(_env_info), traj_path
AssertionError: /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5

I'm unsure whether I should merge the .h5 files in the PushChair-v1 folder or just pick one subfolder, so I picked one subfolder (3003) in PushChair-v1 and tried to convert its .h5 file to control_mode=base_pd_joint_vel_arm_pd_joint_delta_pos using:

python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PushChair-v1/3003/trajectory.h5   --save-traj --target-control-mode base_pd_joint_vel_arm_pd_joint_delta_pos --obs-mode none --num-procs 16

It seems that all the episodes have been skipped:

Episode 225 is not replayed successfully. Skipping
Episode 245 is not replayed successfully. Skipping
Episode 150 is not replayed successfully. Skipping
Episode 169 is not replayed successfully. Skipping
Episode 207 is not replayed successfully. Skipping
Episode 55 is not replayed successfully. Skipping
Episode 263 is not replayed successfully. Skipping
Episode 37 is not replayed successfully. Skipping
Episode 281 is not replayed successfully. Skipping
Episode 132 is not replayed successfully. Skipping
Episode 299 is not replayed successfully. Skipping
Episode 189 is not replayed successfully. Skipping
Episode 151 is not replayed successfully. Skipping
Episode 170 is not replayed successfully. Skipping
Episode 226 is not replayed successfully. Skipping
Episode 208 is not replayed successfully. Skipping
Episode 56 is not replayed successfully. Skipping

So I could only convert the .h5 file to the desired observation mode (pointcloud), keeping its original control mode, using:

python tools/convert_state.py --env-name PushChair-v1 --num-procs 32 \
--traj-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5 \
--json-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.json \
--output-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/convert.h5 \
--max-num-traj -1 --obs-mode pointcloud --n-points 1200 --obs-frame ee --reward-mode dense --control-mode base_pd_joint_vel_arm_pd_joint_vel

but when I trained the DAPG agent using:

DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_dapg_PegInsertionSide-v0.py 
--work-dir ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud --gpu-ids 3 
--cfg-options "env_cfg.env_name=PushChair-v1" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" 
"rollout_cfg.num_procs=20" "env_cfg.reward_mode=dense" "env_cfg.control_mode=base_pd_joint_vel_arm_pd_joint_vel" 
"env_cfg.obs_frame=ee" "agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000" 
"agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1" 
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/convert.h5" 
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True"

the following error came out:

PushChair-v1-train - (ppo.py:364) - INFO - 2023-04-25,10:04:53 - Number of batches in one PPO epoch: 61!
PushChair-v1-train - (ppo.py:451) - INFO - 2023-04-25,10:04:55 - **Warming up critic at the beginning of training; this causes reported ETA to be slower than actual ETA**
PushChair-v1-train - (train_rl.py:371) - INFO - 2023-04-25,10:05:31 - 20000/50000000000(0%) Passed time:8m ETA:38y1m23d5h55m53s samples_stats: rewards:0.0[0.0, 0.0], max_single_R:0.00[0.00, 0.00], lens:200[200, 200], success:0.00 gpu_mem_ratio: 72.7% gpu_mem: 28.80G gpu_mem_this: 17.80G gpu_util: 8% old_log_p: -8.232 adv_mean: -2.404e-04 adv_std: 7.275e-05 max_normed_adv: 4.868 v_target: 1.427 ori_returns: 2.515e-04 grad_norm: 14.165 clipped_grad_norm: 0.429 critic_loss: 0.653 critic_mse: 1.306 critic_err: 2.786e-02 policy_std: 0.368 entropy: 8.395 mean_p_ratio: 0.998 max_p_ratio: 4.533 log_p: -8.289 clip_frac: 0.441 approx_kl: 0.113 actor_loss: 6.380e-02 entropy_loss: 0 demo_nll_loss: 31.084 demo_actor_loss: 3.108 visual_grad: 0.101 actor_mlp_grad: 4.924 critic_mlp_grad: 1.799 max_policy_abs: 1.018 policy_norm: 34.512 max_critic_abs: 1.018 critic_norm: 33.367 num_actor_epoch: 1.000 demo_lambda: 0.100 episode_time: 460.445 collect_sample_time: 424.485 memory: 82.94G
PushChair-v1-train - (rollout.py:117) - INFO - 2023-04-25,10:13:18 - Finish with 20000 samples, simulation time/FPS:440.29/45.42, agent time/FPS:4.53/4416.83, overhead time:5.36
PushChair-v1-train - (ppo.py:364) - INFO - 2023-04-25,10:13:18 - Number of batches in one PPO epoch: 61!
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 293, in main_rl
    train_rl(
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/apis/train_rl.py", line 300, in train_rl
    training_infos = agent.update_parameters(replay, updates=total_updates, **extra_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 392, in update_parameters
    demo_memory["dones"] = demo_memory["dones"] * 0
TypeError: 'NoneType' object is not subscriptable
Exception ignored in: <function SharedGDict.__del__ at 0x7f854ad55ca0>
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/shared_memory.py", line 237, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/resource_tracker.py:203: UserWarning: resource_tracker: There appear to be 39 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Even when I converted the .h5 file with reward_mode=sparse, it still didn't work. How can I solve these problems?

xuanlinli17 commented 1 year ago

The cause is that trajectories are not replayed successfully.

Please follow https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/maniskill1.sh to convert ManiSkill1 trajectories, instead of general_rigid_body_single_object_envs.sh. I'll modify general_rigid_body_single_object_envs.sh to clear up the confusion.

There is a note in our documentation on converting such demonstrations. You need to use --use-env-states to convert maniskill1 trajectories. In addition, you have to use the exact same controller (base_pd_joint_vel_arm_pd_joint_vel) as the demo, and conversion to other controllers will cause replay failures. This is because the environment is not quasi-static.
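
Concretely, the replay step would look roughly like this (the path is illustrative; see maniskill1.sh for the exact commands used for each environment):

    # Replay the raw ManiSkill1-style demo using the recorded environment states,
    # keeping the exact same controller as the demo (base_pd_joint_vel_arm_pd_joint_vel).
    python -m mani_skill2.trajectory.replay_trajectory \
    --traj-path demos/rigid_body/PushChair-v1/3003/trajectory.h5 \
    --save-traj --use-env-states --obs-mode none \
    --target-control-mode base_pd_joint_vel_arm_pd_joint_vel --num-procs 16
    # Afterwards, render point cloud observations with ManiSkill2-Learn's tools/convert_state.py,
    # again passing --control-mode base_pd_joint_vel_arm_pd_joint_vel as in the earlier commands.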

xtli12 commented 1 year ago

Hi, when I followed https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/maniskill1.sh to convert the ManiSkill1 trajectories, this error came out:

Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
    self.run()
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/tools/convert_state.py", line 83, in convert_state_representation
    env.reset(**reset_kwargs[cur_episode_num])
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 97, in reset
    obs = self.env.reset(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 222, in reset
    return self.observation(obs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 381, in observation
    pose = observation["extra"]["tcp_pose"]
KeyError: 'tcp_pose'
xuanlinli17 commented 1 year ago

Are you using the latest ManiSkill2-learn?

xtli12 commented 1 year ago

Oh, I mistakenly used the old version of ManiSkill2-Learn. Now I can get trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5, but when I run:

DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_pn.py 
--work-dir ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud --gpu-ids 3 
--cfg-options "env_cfg.env_name=MoveBucket-v1" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "rollout_cfg.num_procs=5" "env_cfg.reward_mode=dense" "env_cfg.control_mode=base_pd_joint_vel_arm_pd_joint_vel"
"env_cfg.obs_frame=ee" "agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000"
"agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1"
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/MoveBucket-v1/3005/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5"
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True"

a new error came out:

Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 254, in main_rl
    agent = build_agent(cfg.agent_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/builder.py", line 12, in build_agent
    return build_from_cfg(cfg, agent_type, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 137, in __init__
    assert key in self.demo_replay.memory, f"DAPG needs {key} in your demo!"
TypeError: argument of type 'NoneType' is not iterable
xuanlinli17 commented 1 year ago

Are your demo path and env name correct? Your work directory says PushChair, but you use the MoveBucket env name and demo path.

xtli12 commented 1 year ago

Oh, my apologies for my carelessness. Although I have corrected the demo path and env name, the previous error still occurs:

PushChair-v1-train - (run_rl.py:258) - INFO - 2023-05-05,06:51:26 - Num of parameters: 0.48M, Model Size: 1.90M
PushChair-v1-train - (run_rl.py:286) - INFO - 2023-05-05,06:51:33 - Work directory of this run ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud
PushChair-v1-train - (run_rl.py:288) - INFO - 2023-05-05,06:51:33 - Train over GPU [3]!
PushChair-v1-train - (train_rl.py:180) - INFO - 2023-05-05,06:51:47 - Rollout state dim: {'xyz': (5, 1200, 3), 'rgb': (5, 1200, 3), 'frame_related_states': (5, 3, 3), 'to_frames': (5, 3, 4, 4), 'state': (5, 47)}, action dim: (5, 20)!
PushChair-v1-train - (train_rl.py:241) - INFO - 2023-05-05,06:51:47 - Begin training!
PushChair-v1-train - (rollout.py:117) - INFO - 2023-05-05,07:21:24 - Finish with 20000 samples, simulation time/FPS:1718.08/11.64, agent time/FPS:28.96/690.56, overhead time:10.26
PushChair-v1-train - (train_rl.py:282) - INFO - 2023-05-05,07:21:24 - Replay buffer shape: {'obs': {'xyz': (20000, 1200, 3), 'rgb': (20000, 1200, 3), 'frame_related_states': (20000, 3, 3), 'to_frames': (20000, 3, 4, 4), 'state': (20000, 47)}, 'next_obs': {'xyz': (20000, 1200, 3), 'rgb': (20000, 1200, 3), 'frame_related_states': (20000, 3, 3), 'to_frames': (20000, 3, 4, 4), 'state': (20000, 47)}, 'actions': (20000, 20), 'rewards': (20000, 1), 'dones': (20000, 1), 'episode_dones': (20000, 1), 'infos': {'elapsed_steps': (20000, 1), 'success': (20000, 1), 'chair_close_to_target': (20000, 1), 'chair_standing': (20000, 1), 'chair_static': (20000, 1), 'dist_chair_to_target': (20000, 1), 'chair_tilt': (20000, 1), 'chair_vel_norm': (20000, 1), 'chair_ang_vel_norm': (20000, 1), 'dist_ee_to_chair': (20000, 1), 'action_norm': (20000, 1), 'cos_chair_vel_to_target': (20000, 1), 'stage_reward': (20000, 1), 'reward': (20000, 1), 'TimeLimit.truncated': (20000, 1)}, 'worker_indices': (20000, 1), 'is_truncated': (20000, 1)}.
PushChair-v1-train - (ppo.py:364) - INFO - 2023-05-05,07:21:25 - Number of batches in one PPO epoch: 61!
PushChair-v1-train - (ppo.py:451) - INFO - 2023-05-05,07:21:26 - **Warming up critic at the beginning of training; this causes reported ETA to be slower than actual ETA**
PushChair-v1-train - (train_rl.py:371) - INFO - 2023-05-05,07:22:07 - 20000/5000000(0%) Passed time:30m20s ETA:5d5h53m10s samples_stats: rewards:-2088.6[-2213.9, -1854.6], max_single_R:-9.46[-10.30, -5.12], lens:200[200, 200], success:0.00 gpu_mem_ratio: 89.5% gpu_mem: 35.43G gpu_mem_this: 11.40G gpu_util: 8% old_log_p: -8.239 adv_mean: -41.156 adv_std: 8.160 max_normed_adv: 6.999 v_target: -5.543 ori_returns: -166.138 grad_norm: 20.886 clipped_grad_norm: 0.486 critic_loss: 1.615 critic_mse: 3.229 critic_err: 2.891e-02 policy_std: 0.368 entropy: 8.392 mean_p_ratio: 0.983 max_p_ratio: 3.581 log_p: -8.453 clip_frac: 0.478 approx_kl: 0.156 actor_loss: 2.583e-02 entropy_loss: 0 demo_nll_loss: 26.356 demo_actor_loss: 2.636 visual_grad: 0.449 actor_mlp_grad: 4.375 critic_mlp_grad: 6.330 max_policy_abs: 1.029 policy_norm: 34.622 max_critic_abs: 1.029 critic_norm: 33.534 num_actor_epoch: 1.000 demo_lambda: 0.100 episode_time: 1800.168 collect_sample_time: 1757.711 memory: 10.55G
PushChair-v1-train - (rollout.py:117) - INFO - 2023-05-05,07:53:53 - Finish with 20000 samples, simulation time/FPS:1855.60/10.78, agent time/FPS:23.93/835.70, overhead time:9.30
PushChair-v1-train - (ppo.py:364) - INFO - 2023-05-05,07:53:53 - Number of batches in one PPO epoch: 61!
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 293, in main_rl
    train_rl(
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/apis/train_rl.py", line 300, in train_rl
    training_infos = agent.update_parameters(replay, updates=total_updates, **extra_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 392, in update_parameters
    demo_memory["dones"] = demo_memory["dones"] * 0
TypeError: 'NoneType' object is not subscriptable
xuanlinli17 commented 1 year ago

The environment shouldn't run this slowly... Did you check the CPU usage on your machine?

Also, to fix the current issue, set the cache_size and capacity of the demo replay buffer to be smaller than or equal to the total number of observations in the demo.
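
For example (the numbers here are purely illustrative; the "Load ... samples in total" line printed when the demo replay buffer loads your file tells you the real count), the demo replay overrides in the training command would become:

    # Replace the corresponding --cfg-options entries in the run_rl.py command;
    # capacity/cache_size must not exceed the total number of samples in the converted demo,
    # e.g. if the demo holds roughly 12000 samples:
    "agent_cfg.demo_replay_cfg.capacity=10000" \
    "agent_cfg.demo_replay_cfg.cache_size=10000" \
    "agent_cfg.demo_replay_cfg.dynamic_loading=True" \
    "agent_cfg.demo_replay_cfg.num_samples=-1"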

xtli12 commented 1 year ago

Thanks to your kind assistance, the problem has been solved! As for the slow training, I cleaned up my GPU and set the number of rollout processes to 20, but it still seems too slow:

(train_rl.py:371) - INFO - 2023-05-07,01:31:59 - 20000/5000000(0%) Passed time:10m19s ETA:1d18h50m53s samples_stats: rewards:-1711.6[-2268.0, -690.9], max_single_R:-6.05[-8.68, -1.98], lens:200[200, 200], success:0.06 gpu_mem_ratio: 59.1% gpu_mem: 23.42G gpu_mem_this: 17.92G gpu_util: 2% old_log_p: -8.272 adv_mean: -3.233 adv_std: 15.937 max_normed_adv: 4.263 v_target: -7.672 ori_returns: -168.904 grad_norm: 11.013 clipped_grad_norm: 0.500 critic_loss: 0.275 critic_mse: 0.551 critic_err: 0.409 policy_std: 0.397 entropy: 9.819 mean_p_ratio: 1.068 max_p_ratio: 10.066 log_p: -8.528 clip_frac: 0.723 approx_kl: 0.298 actor_loss: 0.111 entropy_loss: 0 demo_nll_loss: 17.095 demo_actor_loss: 1.331 visual_grad: 4.738 actor_mlp_grad: 8.533 critic_mlp_grad: 5.445 max_policy_abs: 1.149 policy_norm: 36.183 max_critic_abs: 1.095 critic_norm: 34.873 num_actor_epoch: 1.000 demo_lambda: 7.783e-02 episode_time: 604.302 collect_sample_time: 566.976 memory: 23.34G

What can I do to speed it up?

xuanlinli17 commented 1 year ago

Set rollout_cfg.num_procs to a moderate value: not too large (otherwise the CPU load exceeds the number of available cores) and not too small.
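
As a rough starting point (the value below is just an example, not a recommendation for your specific machine):

    # Check how many CPU cores are available, then choose a number of rollout workers
    # that leaves headroom for the simulator and the training process.
    nproc
    # Then override it in --cfg-options instead of 20, e.g. "rollout_cfg.num_procs=8".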

xtli12 commented 1 year ago

OK, thank you for your kind assistance!