NVlabs / handover-sim2real

Official code for CVPR'23 paper: Learning Human-to-Robot Handovers from Point Clouds
https://handover-sim2real.github.io

What is the simultaneous setting? #4

Open wzf2022 opened 9 months ago

wzf2022 commented 9 months ago

In your paper, you discuss the "simultaneous" setting, where the robot is permitted to move from the beginning of the episode. However, in the provided codebase, the _TIME_WAIT parameter is set to 1.5 seconds for the simultaneous setting and 3.0 seconds for the sequential setting (source). I'm curious about why there is a non-zero _TIME_WAIT for the simultaneous setting.

Additionally, in the "Reproducing CVPR 2023 Results" section, the paper provides some .npz files. Is it possible to independently test the model and generate these .npz files using the specified final model directories (e.g., output/cvpr2023_models/2022-10-16_08-48-30_finetune_5_s0_train)? Can the testing command below be used for this purpose?

GADDPG_DIR=GA-DDPG CUDA_VISIBLE_DEVICES=0 python examples/test.py \
  --model-dir output/cvpr2023_models/2022-10-16_08-48-30_finetune_5_s0_train \
  --without-hold \
  SIM.RENDER True

I would appreciate any clarification on these points.

ychao-nvidia commented 9 months ago

I'm curious about why there is a non-zero _TIME_WAIT for the simultaneous setting.

The simultaneous setting (aka w/o hold in handover-sim) is indeed intended to permit the robot to move from the beginning of the episode. Meanwhile, we observe that humans generally move faster than the robot, so it is difficult to avoid collisions if the robot approaches the human while the human is also approaching the robot.

The ideal behavior would be for the robot to move directly to the handover location (i.e. it has to somehow predict where that is), wait for the human hand to arrive if needed, and grasp the object immediately after the human hand stops in the handover pose. While this was our goal, our models have not yet exhibited this type of behavior, possibly due to the lack of such supervision in training. As a result, a good number of failures can still come from grasping while the human is still moving, or even before the human picks up the object from the table. To avoid such cases, we hand-coded the policy to hold still for the first 1.5 seconds, ensuring the robot won't attempt to grasp before the human picks the object up. That said, how to train the robot to achieve the ideal behavior mentioned above would be an interesting research direction.
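For readers of this thread, the hand-coded hold-still behavior described above can be sketched as a thin policy wrapper. This is a hypothetical illustration, not the repo's actual code; the names `policy`, `obs`, and `control_dt` are assumptions.

```python
# Hypothetical sketch of a "hold still, then act" wrapper, mirroring the
# _TIME_WAIT behavior described in the thread. Not the actual repo code.

class HoldThenAct:
    """Return the current joint position (no motion) until `time_wait` elapses."""

    def __init__(self, policy, time_wait=1.5, control_dt=0.008):
        self._policy = policy          # underlying grasping policy: obs -> action
        self._time_wait = time_wait    # seconds to hold still at episode start
        self._control_dt = control_dt  # assumed control period per step
        self._elapsed = 0.0

    def reset(self):
        self._elapsed = 0.0

    def act(self, obs, current_joint_position):
        if self._elapsed < self._time_wait:
            self._elapsed += self._control_dt
            return current_joint_position  # hold still: command the current pose
        return self._policy(obs)           # hand-off window over: act normally
```

In the actual codebase the same effect is achieved differently at training time, as explained later in this thread (by starting the mocap sequence 1.5 seconds in rather than commanding the robot to wait).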

Is it possible to independently test the model and generate these .npz files using the specified final model directories (e.g., output/cvpr2023_models/2022-10-16_08-48-30_finetune_5_s0_train)? Can the testing command below be used for this purpose?

Yes, that is exactly what the first part of the Testing section in README.md is about. To generate *.npz, you need to set BENCHMARK.SAVE_RESULT to True. Look at the third bullet point in that section:

GADDPG_DIR=GA-DDPG CUDA_VISIBLE_DEVICES=0 python examples/test.py \
  --model-dir output/cvpr2023_models/2022-10-16_08-48-30_finetune_5_s0_train \
  --without-hold \
  --name finetune_5 \
  BENCHMARK.SAVE_RESULT True
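Once the run finishes, the saved `.npz` files can be inspected with NumPy. The path and the exact keys stored inside are assumptions here; list `data.files` to see what `BENCHMARK.SAVE_RESULT` actually wrote in your run.

```python
# Quick sanity check of a saved benchmark result file. The file path and the
# keys it contains depend on your run; this helper just reports what is there.
import numpy as np


def summarize_npz(path):
    """Return a dict mapping each array name in the .npz to its shape."""
    with np.load(path, allow_pickle=True) as data:
        return {key: data[key].shape for key in data.files}
```

Usage would be something like `summarize_npz("output/.../result.npz")` (hypothetical path) to confirm the file is well-formed before feeding it to the evaluation script.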
wzf2022 commented 9 months ago

Thank you for your detailed response!

I agree that training the robot to achieve the ideal behavior would be an interesting research direction. Regarding the finetuning stage, I'm curious whether the robot remains stationary during the initial 1.5 seconds. I haven't come across specific parameters controlling this behavior. Given that the robot holds still for the initial 1.5 seconds during testing in the simultaneous setting, I wonder whether this behavior is consistent during the training phase.

ychao-nvidia commented 9 months ago

Regarding the finetuning stage, I'm curious if the robot remains stationary during the initial 1.5 seconds.

Yes, the behavior is consistent during the finetuning stage.

I haven't come across specific parameters controlling this behavior.

Instead of letting the robot wait for 1.5 seconds, we directly start the scene (i.e. mocap data) from the 1.5-second mark (which also saves simulation cycles in training). The parameter for that lives in the config file of the handover-sim submodule here. It alters the starting frame of YCB (here) and MANO (here). As for training, we set this parameter for pretraining here, and for finetuning here. As you can see, for pretraining we start from the last frame, and for finetuning we start from the one-and-a-half-second mark. For evaluation, we always start from the first frame.
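The mapping described above, from a start-frame option to a mocap frame index, can be sketched as follows. This is an illustration only: the option names mirror the thread (`first`, `one_and_half_second`, `last`), but the helper and the assumed frame rate are not the handover-sim implementation.

```python
# Illustrative sketch of mapping a YCB_MANO_START_FRAME-style config value to
# a starting mocap frame index. The 30 fps default and this function are
# assumptions for illustration, not the actual handover-sim code.

def start_frame(option, num_frames, fps=30):
    """Return the frame index at which the mocap sequence should start."""
    if option == "first":
        return 0
    if option == "one_and_half_second":
        # Skip the first 1.5 s of motion, clamped to the sequence length.
        return min(int(1.5 * fps), num_frames - 1)
    if option == "last":
        return num_frames - 1
    raise ValueError(f"unknown start-frame option: {option}")
```

Under this sketch, pretraining (`last`) would see only the final static pose, finetuning (`one_and_half_second`) would skip the initial reach, and evaluation (`first`) would replay the full sequence.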

GenH2R commented 8 months ago

Hello, I followed your instructions and made a modification in the examples/finetune.yaml file by changing the value of YCB_MANO_START_FRAME from 'one_and_half_second' to 'first'. However, I encountered an error, which is not present when YCB_MANO_START_FRAME is set to 'one_and_half_second':

Traceback (most recent call last):
  File "examples/train.py", line 535, in <module>
    main()
  File "examples/train.py", line 499, in main
    res = ray.get(refs)
  File "/home/haoran/anaconda3/envs/sammy/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/haoran/anaconda3/envs/sammy/lib/python3.8/site-packages/ray/_private/worker.py", line 2275, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ActorWrapperRemote.rollout() (pid=130042, ip=10.210.5.7, repr=<train.ActorWrapperRemote object at 0x7f45a7c5e580>)
  File "examples/train.py", line 88, in rollout
    self._rollout_one(explore, test, noise_scale)
  File "examples/train.py", line 225, in _rollout_one
    obs, reward, done, info = self._step_env_repeat(
  File "examples/train.py", line 319, in _step_env_repeat
    obs, reward, done, info = self._env.step(target_joint_position)
  File "/share/haoran/HRI/cvpr2023/handover-sim/handover/benchmark_wrapper.py", line 59, in step
    observation, reward, done, info = super().step(action)
  File "/home/haoran/anaconda3/envs/sammy/lib/python3.8/site-packages/easysim/simulator_env.py", line 103, in step
    return self.env.step(action)
  File "/home/haoran/anaconda3/envs/sammy/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 13, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/haoran/anaconda3/envs/sammy/lib/python3.8/site-packages/easysim/simulator_env.py", line 68, in step
    self.pre_step(action)
  File "/share/haoran/HRI/cvpr2023/handover-sim/handover/handover_env.py", line 129, in pre_step
    self.mano.step(self.simulator)
  File "/share/haoran/HRI/cvpr2023/handover-sim/handover/mano.py", line 133, in step
    for i in range(-1, simulator._num_links[self.body.name] - 1):
KeyError: '20200820-subject-03_right'

My command is:

GADDPG_DIR=GA-DDPG OMG_PLANNER_DIR=OMG-Planner CUDA_VISIBLE_DEVICES=7 python examples/train.py \
  --cfg-file examples/finetune.yaml \
  --seed 1 \
  --use-ray \
  --use-grasp-predictor \
  --pretrained-dir output/cvpr2023_models/2022-09-30_11-54-42_pretrain_1_s0_train \
  BENCHMARK.SETUP s0

GenH2R commented 8 months ago

I attempted to address the bug and retrain the model when using the 'first' setting for YCB_MANO_START_FRAME. However, I observed that the success rate under simultaneous settings was only around 20%. Is that a normal result? As ychao-nvidia mentioned, there could still be a good number of failures coming from grasping while the human is still moving, or even before the human picks up the object from the table.

ychao-nvidia commented 8 months ago

I followed your instructions and made a modification in the examples/finetune.yaml file by changing the value of YCB_MANO_START_FRAME from 'one_and_half_second' to 'first'. However, I encountered an error, which is not present when YCB_MANO_START_FRAME is set to 'one_and_half_second'.

This should have been fixed in this commit.

I attempted to address the bug and retrain the model when using the 'first' setting for YCB_MANO_START_FRAME. However, I observed that the success rate under simultaneous settings was only around 20%.

This is consistent with our experience, if I remember correctly.