agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.h5
I think you used the state demo instead of the point cloud demo?
Oh, I'm afraid I did use the state demo. Following your prompt, I used replay_trajectory.py like this:
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PegInsertionSide-v0/trajectory.h5 \
  --save-traj --target-control-mode pd_ee_delta_pose --obs-mode pointcloud --num-procs 1
to convert the raw files into the desired observation and control modes, and then ran the script:
python tools/convert_state.py --env-name PegInsertionSide-v0 --num-procs 1 \
  --traj-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.pointcloud.pd_ee_delta_pose.h5 \
  --json-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/trajectory.pointcloud.pd_ee_delta_pose.json \
  --output-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegInsertionSide-v0/pd_ee_delta_pose.h5 \
  --control-mode pd_ee_delta_pose --max-num-traj -1 --obs-mode pointcloud --n-points 1200 --obs-frame ee --reward-mode dense --render
to render point cloud demonstrations. But when I ran the script to train a DAPG agent under env_name=PegInsertionSide-v0:
DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_pn.py
--work-dir /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/Result/PegInsertionSide-v0/dapg_pointcloud
--gpu-ids 2 --cfg-options "env_cfg.env_name=PegInsertionSide-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200"
"rollout_cfg.num_procs=5" "env_cfg.reward_mode=dense" "env_cfg.control_mode=pd_ee_delta_pose" "env_cfg.obs_frame=ee"
"agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000" "agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1"
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PegIPegInsertionSide-v0/pd_ee_delta_pose.h5"
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" "train_cfg.total_steps=25000000" "train_cfg.n_checkpoint=5000000"
the following error came out:
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - meta_collect_time: 2023-04-23-05:40:31
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - PYRL: version: 1.8.0b0
PegInsertionSide-v0-train - (collect_env.py:133) - INFO - 2023-04-23,12:40:31 - ManiSkill2:
PegInsertionSide-v0-train - (run_rl.py:243) - INFO - 2023-04-23,12:40:31 - Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:245) - INFO - 2023-04-23,12:40:31 - Finish Initialize torch!
PegInsertionSide-v0-train - (run_rl.py:253) - INFO - 2023-04-23,12:40:31 - Build agent!
Traceback (most recent call last):
File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
main()
File "maniskill2_learn/apis/run_rl.py", line 452, in main
run_one_process(0, 1, args, cfg)
File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
File "maniskill2_learn/apis/run_rl.py", line 254, in main_rl
agent = build_agent(cfg.agent_cfg)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/builder.py", line 12, in build_agent
return build_from_cfg(cfg, agent_type, default_args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
return obj_cls(**args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 137, in __init__
assert key in self.demo_replay.memory, f"DAPG needs {key} in your demo!"
TypeError: argument of type 'NoneType' is not iterable
Exception ignored in: <function SharedGDict.__del__ at 0x7f6e741fdca0>
Traceback (most recent call last):
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/shared_memory.py", line 237, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/resource_tracker.py:203: UserWarning: resource_tracker: There appear to be 18 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
How can I solve it?
Please follow this script (https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/general_rigid_body_single_object_envs.sh) to convert the trajectories.
Note:
- Please do not use mani_skill2.trajectory.replay_trajectory with a visual observation mode. The visual observation is not post-processed, consumes lots of space, and cannot be directly used by downstream learning frameworks such as ManiSkill2-Learn, which causes your error.
- --render in ManiSkill2-Learn's tools/convert_state.py is not meant to render point cloud demonstrations. It is meant to show the demonstration on the monitor for visualization. It only works with --num-procs=1, and when you parallelize demo conversion by setting a higher --num-procs, you should remove this argument.
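For reference, a sketch of the corrected two-step conversion, based on the commands earlier in this thread. The intermediate trajectory.none.pd_ee_delta_pose file names, the output name, and the --num-procs value are assumptions; check the linked script and the actual output of replay_trajectory for the exact names.

# Step 1: convert the control mode only, without visual observations.
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PegInsertionSide-v0/trajectory.h5 \
  --save-traj --target-control-mode pd_ee_delta_pose --obs-mode none --num-procs 8
# Step 2: render point clouds with convert_state.py; --render is dropped because --num-procs > 1.
python tools/convert_state.py --env-name PegInsertionSide-v0 --num-procs 8 \
  --traj-name demos/rigid_body/PegInsertionSide-v0/trajectory.none.pd_ee_delta_pose.h5 \
  --json-name demos/rigid_body/PegInsertionSide-v0/trajectory.none.pd_ee_delta_pose.json \
  --output-name demos/rigid_body/PegInsertionSide-v0/trajectory_pointcloud.pd_ee_delta_pose.h5 \
  --control-mode pd_ee_delta_pose --max-num-traj -1 --obs-mode pointcloud --n-points 1200 \
  --obs-frame ee --reward-mode dense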
Hi, with your kind assistance, I was able to solve the DAPG training issues for env_name=PegInsertionSide-v0, AssemblingKits-v0, and PlugCharger-v0. Thank you very much! However, when I trained DAPG under env_name=PushChair-v1, I found that there are many folders within the PushChair-v1 directory, each containing a .h5 file. When I followed the script (https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/general_rigid_body_multi_object_envs.sh) to merge the .h5 files by using:
ENV="PushChair-v1"
python -m mani_skill2.trajectory.merge_trajectory \
-i /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/$ENV/ \
-o /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/$ENV/trajectory_merged.h5 \
-p trajectory.h5
the following error came out (the above script worked fine with env_name=TurnFaucet-v0 and merged the .h5 files successfully):
(sapien) lxt21@ubuntu:~/SAPIEN-master/ManiSkill2-Learn-main-old/scripts$ bash test_general_rigid_body_multi_object_envs.sh
Merge to /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/trajectory_merged.h5
Merging /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3001/trajectory.h5
Merging /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5
Traceback (most recent call last):
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 81, in <module>
main()
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 77, in main
merge_h5(args.output_path, traj_paths)
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/trajectory/merge_trajectory.py", line 32, in merge_h5
assert str(env_info) == str(_env_info), traj_path
AssertionError: /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5
I'm unsure whether I should merge all the .h5 files in the PushChair-v1 folder or just pick one subfolder, so I picked one subfolder (3003) in PushChair-v1 and converted its .h5 file into control_mode=base_pd_joint_vel_arm_pd_joint_delta_pos using the script:
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PushChair-v1/3003/trajectory.h5 --save-traj --target-control-mode base_pd_joint_vel_arm_pd_joint_delta_pos --obs-mode none --num-procs 16
It seems that all the episodes have been skipped:
Episode 225 is not replayed successfully. Skipping
Episode 245 is not replayed successfully. Skipping
Episode 150 is not replayed successfully. Skipping
Episode 169 is not replayed successfully. Skipping
Episode 207 is not replayed successfully. Skipping
Episode 55 is not replayed successfully. Skipping
Episode 263 is not replayed successfully. Skipping
Episode 37 is not replayed successfully. Skipping
Episode 281 is not replayed successfully. Skipping
Episode 132 is not replayed successfully. Skipping
Episode 299 is not replayed successfully. Skipping
Episode 189 is not replayed successfully. Skipping
Episode 151 is not replayed successfully. Skipping
Episode 170 is not replayed successfully. Skipping
Episode 226 is not replayed successfully. Skipping
Episode 208 is not replayed successfully. Skipping
Episode 56 is not replayed successfully. Skipping
So I could only convert the .h5 files into the desired observation mode (pointcloud) by using the script:
python tools/convert_state.py --env-name PushChair-v1 --num-procs 32 \
  --traj-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.h5 \
  --json-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/trajectory.json \
  --output-name /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/convert.h5 \
  --max-num-traj -1 --obs-mode pointcloud --n-points 1200 --obs-frame ee --reward-mode dense --control-mode base_pd_joint_vel_arm_pd_joint_vel
but when I trained the DAPG agent using the script:
DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_dapg_PegInsertionSide-v0.py
--work-dir ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud --gpu-ids 3
--cfg-options "env_cfg.env_name=PushChair-v1" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200"
"rollout_cfg.num_procs=20" "env_cfg.reward_mode=dense" "env_cfg.control_mode=base_pd_joint_vel_arm_pd_joint_vel"
"env_cfg.obs_frame=ee" "agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000"
"agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1"
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/PushChair-v1/3003/convert.h5"
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True"
the following error came out:
PushChair-v1-train - (ppo.py:364) - INFO - 2023-04-25,10:04:53 - Number of batches in one PPO epoch: 61!
PushChair-v1-train - (ppo.py:451) - INFO - 2023-04-25,10:04:55 - **Warming up critic at the beginning of training; this causes reported ETA to be slower than actual ETA**
PushChair-v1-train - (train_rl.py:371) - INFO - 2023-04-25,10:05:31 - 20000/50000000000(0%) Passed time:8m ETA:38y1m23d5h55m53s samples_stats: rewards:0.0[0.0, 0.0], max_single_R:0.00[0.00, 0.00], lens:200[200, 200], success:0.00 gpu_mem_ratio: 72.7% gpu_mem: 28.80G gpu_mem_this: 17.80G gpu_util: 8% old_log_p: -8.232 adv_mean: -2.404e-04 adv_std: 7.275e-05 max_normed_adv: 4.868 v_target: 1.427 ori_returns: 2.515e-04 grad_norm: 14.165 clipped_grad_norm: 0.429 critic_loss: 0.653 critic_mse: 1.306 critic_err: 2.786e-02 policy_std: 0.368 entropy: 8.395 mean_p_ratio: 0.998 max_p_ratio: 4.533 log_p: -8.289 clip_frac: 0.441 approx_kl: 0.113 actor_loss: 6.380e-02 entropy_loss: 0 demo_nll_loss: 31.084 demo_actor_loss: 3.108 visual_grad: 0.101 actor_mlp_grad: 4.924 critic_mlp_grad: 1.799 max_policy_abs: 1.018 policy_norm: 34.512 max_critic_abs: 1.018 critic_norm: 33.367 num_actor_epoch: 1.000 demo_lambda: 0.100 episode_time: 460.445 collect_sample_time: 424.485 memory: 82.94G
PushChair-v1-train - (rollout.py:117) - INFO - 2023-04-25,10:13:18 - Finish with 20000 samples, simulation time/FPS:440.29/45.42, agent time/FPS:4.53/4416.83, overhead time:5.36
PushChair-v1-train - (ppo.py:364) - INFO - 2023-04-25,10:13:18 - Number of batches in one PPO epoch: 61!
Traceback (most recent call last):
File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
main()
File "maniskill2_learn/apis/run_rl.py", line 452, in main
run_one_process(0, 1, args, cfg)
File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
File "maniskill2_learn/apis/run_rl.py", line 293, in main_rl
train_rl(
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/apis/train_rl.py", line 300, in train_rl
training_infos = agent.update_parameters(replay, updates=total_updates, **extra_args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 392, in update_parameters
demo_memory["dones"] = demo_memory["dones"] * 0
TypeError: 'NoneType' object is not subscriptable
Exception ignored in: <function SharedGDict.__del__ at 0x7f854ad55ca0>
Traceback (most recent call last):
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/shared_memory.py", line 237, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/resource_tracker.py:203: UserWarning: resource_tracker: There appear to be 39 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Even when I converted the .h5 file with reward_mode=sparse, it still didn't work. How can I solve these problems?
The cause is that the trajectories are not replayed successfully.
Please follow https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/maniskill1.sh to convert ManiSkill1 trajectories, instead of general_rigid_body_single_object_envs.sh. I'll modify general_rigid_body_single_object_envs.sh to clear up the confusion.
There is a note in our documentation on converting such demonstrations. You need to use --use-env-states to convert ManiSkill1 trajectories. In addition, you have to use the exact same controller (base_pd_joint_vel_arm_pd_joint_vel) as the demo; conversion to other controllers will cause replay failures, because the environment is not quasi-static.
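As a rough sketch only (not a substitute for maniskill1.sh): assuming --use-env-states is passed to replay_trajectory, and that the JSON/output file names and --num-procs value shown here match what the script actually produces, the conversion for the PushChair-v1/3003 demos used above could look like:

# Replay the ManiSkill1 demo with environment states and the original controller, without visual observations.
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PushChair-v1/3003/trajectory.h5 \
  --save-traj --use-env-states --target-control-mode base_pd_joint_vel_arm_pd_joint_vel --obs-mode none --num-procs 16
# Then generate point cloud observations with ManiSkill2-Learn's convert_state.py (no --render when --num-procs > 1).
python tools/convert_state.py --env-name PushChair-v1 --num-procs 16 \
  --traj-name demos/rigid_body/PushChair-v1/3003/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel.h5 \
  --json-name demos/rigid_body/PushChair-v1/3003/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel.json \
  --output-name demos/rigid_body/PushChair-v1/3003/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5 \
  --control-mode base_pd_joint_vel_arm_pd_joint_vel --max-num-traj -1 --obs-mode pointcloud --n-points 1200 \
  --obs-frame ee --reward-mode dense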
Hi, when I followed https://github.com/haosulab/ManiSkill2-Learn/blob/main/scripts/example_demo_conversion/maniskill1.sh to convert ManiSkill1 trajectories, this error came out:
Traceback (most recent call last):
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
self.run()
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/tools/convert_state.py", line 83, in convert_state_representation
env.reset(**reset_kwargs[cur_episode_num])
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 97, in reset
obs = self.env.reset(*args, **kwargs)
File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 27, in reset
return self.env.reset(**kwargs)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 222, in reset
return self.observation(obs)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/env/wrappers.py", line 381, in observation
pose = observation["extra"]["tcp_pose"]
KeyError: 'tcp_pose'
Are you using the latest ManiSkill2-Learn?
Oh, I mistakenly used the old version of ManiSkill2-Learn. Now I can get trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5, but when I run:
DISPLAY="" python maniskill2_learn/apis/run_rl.py configs/mfrl/dapg/maniskill2_pn.py
--work-dir ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud --gpu-ids 3
--cfg-options "env_cfg.env_name=MoveBucket-v1" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "rollout_cfg.num_procs=5" "env_cfg.reward_mode=dense" "env_cfg.control_mode=base_pd_joint_vel_arm_pd_joint_vel"
"env_cfg.obs_frame=ee" "agent_cfg.demo_replay_cfg.capacity=20000" "agent_cfg.demo_replay_cfg.cache_size=20000"
"agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1"
"agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/MoveBucket-v1/3005/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5"
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True"
this new error came out:
Traceback (most recent call last):
File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
main()
File "maniskill2_learn/apis/run_rl.py", line 452, in main
run_one_process(0, 1, args, cfg)
File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
File "maniskill2_learn/apis/run_rl.py", line 254, in main_rl
agent = build_agent(cfg.agent_cfg)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/builder.py", line 12, in build_agent
return build_from_cfg(cfg, agent_type, default_args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
return obj_cls(**args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 137, in __init__
assert key in self.demo_replay.memory, f"DAPG needs {key} in your demo!"
TypeError: argument of type 'NoneType' is not iterable
Are your demo path and env name correct? You save to a PushChair work directory but use the MoveBucket env name and demo path.
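For instance, the work directory, env_cfg.env_name, and the demo path should all refer to the same task. An illustrative fragment using the MoveBucket-v1 demo file mentioned above (the work-dir value is only an example; other options unchanged):

# All three refer to MoveBucket-v1.
--work-dir ../ManiSkill2-Learn-main/Result/MoveBucket-v1/dapg_pointcloud \
--cfg-options "env_cfg.env_name=MoveBucket-v1" ... \
  "agent_cfg.demo_replay_cfg.buffer_filenames=/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main-old/demos/rigid_body/MoveBucket-v1/3005/trajectory.none.base_pd_joint_vel_arm_pd_joint_vel_pointcloud.h5"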
Oh, my apologies for my carelessness. Although I have corrected the demo path and env name, the previous error still exists.
PushChair-v1-train - (run_rl.py:258) - INFO - 2023-05-05,06:51:26 - Num of parameters: 0.48M, Model Size: 1.90M
PushChair-v1-train - (run_rl.py:286) - INFO - 2023-05-05,06:51:33 - Work directory of this run ../ManiSkill2-Learn-main/Result/PushChair-v1/dapg_pointcloud
PushChair-v1-train - (run_rl.py:288) - INFO - 2023-05-05,06:51:33 - Train over GPU [3]!
PushChair-v1-train - (train_rl.py:180) - INFO - 2023-05-05,06:51:47 - Rollout state dim: {'xyz': (5, 1200, 3), 'rgb': (5, 1200, 3), 'frame_related_states': (5, 3, 3), 'to_frames': (5, 3, 4, 4), 'state': (5, 47)}, action dim: (5, 20)!
PushChair-v1-train - (train_rl.py:241) - INFO - 2023-05-05,06:51:47 - Begin training!
PushChair-v1-train - (rollout.py:117) - INFO - 2023-05-05,07:21:24 - Finish with 20000 samples, simulation time/FPS:1718.08/11.64, agent time/FPS:28.96/690.56, overhead time:10.26
PushChair-v1-train - (train_rl.py:282) - INFO - 2023-05-05,07:21:24 - Replay buffer shape: {'obs': {'xyz': (20000, 1200, 3), 'rgb': (20000, 1200, 3), 'frame_related_states': (20000, 3, 3), 'to_frames': (20000, 3, 4, 4), 'state': (20000, 47)}, 'next_obs': {'xyz': (20000, 1200, 3), 'rgb': (20000, 1200, 3), 'frame_related_states': (20000, 3, 3), 'to_frames': (20000, 3, 4, 4), 'state': (20000, 47)}, 'actions': (20000, 20), 'rewards': (20000, 1), 'dones': (20000, 1), 'episode_dones': (20000, 1), 'infos': {'elapsed_steps': (20000, 1), 'success': (20000, 1), 'chair_close_to_target': (20000, 1), 'chair_standing': (20000, 1), 'chair_static': (20000, 1), 'dist_chair_to_target': (20000, 1), 'chair_tilt': (20000, 1), 'chair_vel_norm': (20000, 1), 'chair_ang_vel_norm': (20000, 1), 'dist_ee_to_chair': (20000, 1), 'action_norm': (20000, 1), 'cos_chair_vel_to_target': (20000, 1), 'stage_reward': (20000, 1), 'reward': (20000, 1), 'TimeLimit.truncated': (20000, 1)}, 'worker_indices': (20000, 1), 'is_truncated': (20000, 1)}.
PushChair-v1-train - (ppo.py:364) - INFO - 2023-05-05,07:21:25 - Number of batches in one PPO epoch: 61!
PushChair-v1-train - (ppo.py:451) - INFO - 2023-05-05,07:21:26 - **Warming up critic at the beginning of training; this causes reported ETA to be slower than actual ETA**
PushChair-v1-train - (train_rl.py:371) - INFO - 2023-05-05,07:22:07 - 20000/5000000(0%) Passed time:30m20s ETA:5d5h53m10s samples_stats: rewards:-2088.6[-2213.9, -1854.6], max_single_R:-9.46[-10.30, -5.12], lens:200[200, 200], success:0.00 gpu_mem_ratio: 89.5% gpu_mem: 35.43G gpu_mem_this: 11.40G gpu_util: 8% old_log_p: -8.239 adv_mean: -41.156 adv_std: 8.160 max_normed_adv: 6.999 v_target: -5.543 ori_returns: -166.138 grad_norm: 20.886 clipped_grad_norm: 0.486 critic_loss: 1.615 critic_mse: 3.229 critic_err: 2.891e-02 policy_std: 0.368 entropy: 8.392 mean_p_ratio: 0.983 max_p_ratio: 3.581 log_p: -8.453 clip_frac: 0.478 approx_kl: 0.156 actor_loss: 2.583e-02 entropy_loss: 0 demo_nll_loss: 26.356 demo_actor_loss: 2.636 visual_grad: 0.449 actor_mlp_grad: 4.375 critic_mlp_grad: 6.330 max_policy_abs: 1.029 policy_norm: 34.622 max_critic_abs: 1.029 critic_norm: 33.534 num_actor_epoch: 1.000 demo_lambda: 0.100 episode_time: 1800.168 collect_sample_time: 1757.711 memory: 10.55G
PushChair-v1-train - (rollout.py:117) - INFO - 2023-05-05,07:53:53 - Finish with 20000 samples, simulation time/FPS:1855.60/10.78, agent time/FPS:23.93/835.70, overhead time:9.30
PushChair-v1-train - (ppo.py:364) - INFO - 2023-05-05,07:53:53 - Number of batches in one PPO epoch: 61!
Traceback (most recent call last):
File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
main()
File "maniskill2_learn/apis/run_rl.py", line 452, in main
run_one_process(0, 1, args, cfg)
File "maniskill2_learn/apis/run_rl.py", line 432, in run_one_process
main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
File "maniskill2_learn/apis/run_rl.py", line 293, in main_rl
train_rl(
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/apis/train_rl.py", line 300, in train_rl
training_infos = agent.update_parameters(replay, updates=total_updates, **extra_args)
File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-new/maniskill2_learn/methods/mfrl/ppo.py", line 392, in update_parameters
demo_memory["dones"] = demo_memory["dones"] * 0
TypeError: 'NoneType' object is not subscriptable
The environment shouldn't run this slowly... Did you check the CPU usage on your machine?
Also, to fix the current issue, set the cache_size and capacity of the demo replay buffer to be smaller than or equal to the total number of observations in the demo.
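For example, the relevant --cfg-options overrides would look like the fragment below; the value 10000 is only illustrative and should be replaced by a number no larger than the total number of transitions in your converted demo file:

# Demo replay buffer sized no larger than the demo itself (10000 is an example value, not a recommendation).
"agent_cfg.demo_replay_cfg.capacity=10000" "agent_cfg.demo_replay_cfg.cache_size=10000" \
"agent_cfg.demo_replay_cfg.dynamic_loading=True" "agent_cfg.demo_replay_cfg.num_samples=-1"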
Thanks to your kind assistance, the problem has been solved! As for the issue of running too slowly, I cleaned up my GPU and set rollout=20, but it still seems too slow:
(train_rl.py:371) - INFO - 2023-05-07,01:31:59 - 20000/5000000(0%) Passed time:10m19s ETA:1d18h50m53s samples_stats: rewards:-1711.6[-2268.0, -690.9], max_single_R:-6.05[-8.68, -1.98], lens:200[200, 200], success:0.06 gpu_mem_ratio: 59.1% gpu_mem: 23.42G gpu_mem_this: 17.92G gpu_util: 2% old_log_p: -8.272 adv_mean: -3.233 adv_std: 15.937 max_normed_adv: 4.263 v_target: -7.672 ori_returns: -168.904 grad_norm: 11.013 clipped_grad_norm: 0.500 critic_loss: 0.275 critic_mse: 0.551 critic_err: 0.409 policy_std: 0.397 entropy: 9.819 mean_p_ratio: 1.068 max_p_ratio: 10.066 log_p: -8.528 clip_frac: 0.723 approx_kl: 0.298 actor_loss: 0.111 entropy_loss: 0 demo_nll_loss: 17.095 demo_actor_loss: 1.331 visual_grad: 4.738 actor_mlp_grad: 8.533 critic_mlp_grad: 5.445 max_policy_abs: 1.149 policy_norm: 36.183 max_critic_abs: 1.095 critic_norm: 34.873 num_actor_epoch: 1.000 demo_lambda: 7.783e-02 episode_time: 604.302 collect_sample_time: 566.976 memory: 23.34G
What can I do to speed it up?
Set rollout_cfg.num_procs neither too large (otherwise the CPU load is too high while there are not enough CPU cores) nor too small.
OK, thank you for your kind assistance!