haosulab / ManiSkill2-Learn

Apache License 2.0

Reproducing BC baseline results on soft body envs #5

Open etaoxing opened 1 year ago

etaoxing commented 1 year ago

I'm having trouble reproducing results on Pinch-v0. I was able to get Write-v0 and Hang-v0 working though.

Here are the commands I'm running: demo conversion with general_soft_body_envs.txt and scripts/example_training/bc_soft_body_pointcloud.sh:

python maniskill2_learn/apis/run_rl.py configs/brl/bc/rgbd_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=rgbd" "env_cfg.n_points=1200" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_rgbd.h5" \
"replay_cfg.num_samples=50" "replay_cfg.cache_size=1024" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"

I've also tried with env_cfg.control_mode=pd_ee_target_delta_pose

How much memory would be needed to run with replay_cfg.num_samples=-1? Or is there a better way of training with all 1500+ demos using replay_cfg.dynamic_loading=True?

xuanlinli17 commented 1 year ago

Pinch-v0 BC demos contain target images, so they consume a lot of memory.

In this case I recommend modifying the demo replay buffer config:

demo_replay_cfg=dict(
    type="ReplayMemory",
    capacity=int(2e4),
    num_samples=-1,
    cache_size=int(2e4),
    dynamic_loading=True,
    synchronized=False,
    keys=["obs", "actions", "dones", "episode_dones"],
    buffer_filenames=[
        "PATH_TO_DEMO.h5",
    ],
),

That is, set demo_replay_cfg.dynamic_loading=True demo_replay_cfg.capacity=20000 demo_replay_cfg.cache_size=20000 demo_replay_cfg.num_samples=-1. This will load all demo data dynamically.

For BC, there is only 1 replay buffer, so replace the above demo_replay_cfg with replay_cfg.

Note that for non-BC algorithms, demo_replay_cfg is not the same as replay_cfg: the demo replay buffer is separate from the (online) replay buffer that collects online environment trajectories.
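The mapping from the config dict to command-line overrides can be sketched as below. This is only an illustration: `to_cfg_options` is a hypothetical helper, not part of ManiSkill2-Learn, and it assumes overrides take the `prefix.key=value` form used in the commands in this thread. List-valued entries (`keys`, `buffer_filenames`) are left out because their override syntax is not shown here.

```python
def to_cfg_options(prefix, cfg):
    """Flatten a (possibly nested) dict into 'prefix.key=value' strings.

    Hypothetical helper for illustration; not a ManiSkill2-Learn API.
    """
    options = []
    for key, value in cfg.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            # Recurse so nested dicts become dotted paths.
            options.extend(to_cfg_options(name, value))
        else:
            options.append(f"{name}={value}")
    return options


# Scalar values copied from the demo_replay_cfg suggested above.
demo_replay_cfg = dict(
    capacity=int(2e4),
    num_samples=-1,
    cache_size=int(2e4),
    dynamic_loading=True,
    synchronized=False,
)

# For BC there is only one replay buffer, so the prefix is replay_cfg.
bc_options = to_cfg_options("replay_cfg", demo_replay_cfg)
for option in bc_options:
    print(f'"{option}"')
```

Each printed string can be appended after --cfg-options. For non-BC algorithms the prefix would stay demo_replay_cfg.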

etaoxing commented 1 year ago

I'm running on a machine with a 3090 and 64GB RAM, so I lowered to replay_cfg.capacity=5000 and replay_cfg.cache_size=5000. Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5 is 40GB.

python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=5000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=5000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"

I'm still unable to train the point cloud BC baseline. GPU utilization shows 0%, occasionally increasing to 3-8%. I've attached the log.

20230124_115441-train.log

xuanlinli17 commented 1 year ago

Does it report anything if you set train_cfg.n_updates=5?

If it does, then it is training; it's just really slow due to file I/O.

BTW, are the demos stored on an SSD?
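One rough way to check whether disk reads could be the bottleneck is to time random chunk reads from the demo file. The sketch below uses only the standard library and is independent of ManiSkill2-Learn. The function name, chunk size, and read count are arbitrary choices for illustration, and the result is only indicative because the OS page cache can inflate it on repeated runs.

```python
import os
import random
import time


def random_read_throughput(path, chunk_bytes=1 << 20, n_reads=64, seed=0):
    """Return the MB/s achieved when reading n_reads random chunks from path."""
    size = os.path.getsize(path)
    chunk_bytes = min(chunk_bytes, size)
    rng = random.Random(seed)
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level read-ahead; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(n_reads):
            f.seek(rng.randrange(0, size - chunk_bytes + 1))
            total += len(f.read(chunk_bytes))
    elapsed = time.perf_counter() - start
    return total / (1 << 20) / max(elapsed, 1e-9)
```

Calling `random_read_throughput("PATH_TO_DEMO.h5")` gives a ballpark figure to compare against the amount of data the replay buffer needs to load per update.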

xuanlinli17 commented 1 year ago

Also, if you implement your own approach, you can do custom processing in env wrappers and implement new architectures, since Pinch-v0 does have the largest observation space among all envs. With the default wrapper, we only downsample the observation point cloud, not target_rgb, target_points, or target_depth.
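A minimal sketch of that kind of custom processing is shown below. It assumes the observation is a flat dict in which target_points is a sequence of points; the actual observation layout and wrapper interface in ManiSkill2-Learn may differ. Both function names are hypothetical, and plain Python lists are used only to keep the example dependency-free (the same index selection would apply to arrays).

```python
import random


def downsample_points(points, n_points, seed=None):
    """Return at most n_points entries of points, sampled without replacement."""
    if len(points) <= n_points:
        return list(points)
    rng = random.Random(seed)
    # Sort the kept indices so the original point order is preserved.
    keep = sorted(rng.sample(range(len(points)), n_points))
    return [points[i] for i in keep]


def shrink_target(obs, n_points=1200, seed=None):
    """Return a shallow copy of obs with 'target_points' downsampled.

    Assumes obs is a flat dict; the key name comes from the thread above.
    """
    out = dict(obs)
    if "target_points" in out:
        out["target_points"] = downsample_points(out["target_points"], n_points, seed)
    return out
```

The default of 1200 mirrors the env_cfg.n_points value used in the commands in this thread. target_rgb and target_depth would need their own handling (for example, image resizing), which is not shown.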

etaoxing commented 1 year ago

Yes, the demos are on the root SSD.

It seems to start training, but grad_norm becomes 0 pretty quickly. This is true for both env_cfg.control_mode=pd_ee_target_delta_pose and env_cfg.control_mode=pd_ee_delta_pose.

python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_target_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_target_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=2000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=2000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=10"

20230119_102001-train.log

xuanlinli17 commented 1 year ago

I was able to reproduce it for point cloud BC. For RGB-D BC, though, the gradient does not fall to zero (RGB-D BC also requires more memory).