Issue while running run_eval_variations.sh

pawanw17 commented 5 months ago

Hi, I was trying to run the script using

bash run_eval_variations.sh $min_var_num $max_var_num $total_processes $processes_per_gpu

but I am running into a problem with data/test_variations_final. Is there a link to download this directory, or a script that generates it? Could you please share it?

robot-colosseum commented 5 months ago

Hi @pawanw17. Sorry for the delayed response. It seems that the dataset has not been generated in the required path. Sorry, we don't mention this step in the README. Here is a link to the reference in the docs, more specifically this part.

There is a script in the Colosseum repo called collect_dataset.sh, which you can use to generate the dataset using the variations. Make sure that you have set up the colosseum repo in your virtual environment. You might need to comment out the first 3 lines if you're not running in headless mode. Also, it might be better to make SAVE_PATH point to the path you use in your eval script.

Let us know if this fixed the problem, or if you have additional issues with any of these repos.

pawanw17 commented 4 months ago

Hi, I generated the dataset for open_drawer and placed it inside data/test_variations_final. This is how it looks:

├── rvt
│   ├── configs
│   ├── data
│   │   └── test_variations_final
│   │       ├── open_drawer_0
│   │       ├── open_drawer_1
│   │       ├── open_drawer_10
│   │       ├── open_drawer_11
│   │       ├── open_drawer_12
│   │       ├── open_drawer_13
│   │       ├── open_drawer_14
│   │       ├── open_drawer_15
│   │       ├── open_drawer_16
│   │       ├── open_drawer_17
│   │       ├── open_drawer_2
│   │       ├── open_drawer_6
│   │       ├── open_drawer_8
│   │       └── open_drawer_9

But on running the script I am getting this error:

MVT Vars: {'training': True, '_parameters': OrderedDict(), '_buffers': OrderedDict(), '_non_persistent_buffers_set': set(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': OrderedDict(), 'depth': 8, 'img_feat_dim': 3, 'img_size': 220, 'add_proprio': True, 'proprio_dim': 4, 'add_lang': True, 'lang_dim': 512, 'lang_len': 77, 'im_channels': 64, 'img_patch_size': 11, 'final_dim': 64, 'attn_dropout': 0.1, 'decoder_dropout': 0.0, 'self_cross_ver': 1, 'add_corr': True, 'add_pixel_loc': True, 'add_depth': True, 'pe_fix': True}
Agent Information
<rvt.models.rvt_agent.RVTAgent object at 0x7f4c8cdd46a0>
[CoppeliaSim:loadinfo]   done.
Traceback (most recent call last):
  File "eval.py", line 579, in <module>
    _eval(args)
  File "eval.py", line 533, in _eval
    scores = eval(
  File "/home/pawan/miniconda3/envs/rvt_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "eval.py", line 316, in eval
    raise e
  File "eval.py", line 309, in eval
    for replay_transition in generator:
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/libs/YARR/yarr/utils/rollout_generator.py", line 40, in generator
    obs = env.reset_to_demo(eval_demo_seed)
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/utils/custom_rlbench_env.py", line 441, in reset_to_demo
    self._task.set_variation(d.variation_number)
AttributeError: 'Demo' object has no attribute 'variation_number'

wpumacay commented 4 months ago

Oh. It's most likely that you're using a different version of RLBench. You should use the fork from the PerAct authors here. Note in this file that they define an extra field for the variation number (this variation number is the variation from RLBench tasks, not ours). You might not have installed the RLBench version from the git submodule. You'll have to collect the dataset for the open_drawer task again. Let me know if you run into any issues.
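A quick way to confirm which RLBench copy the environment actually picks up is to ask Python where the package would be imported from. The snippet below is only a minimal sketch: `module_origin` is a hypothetical helper name, not something from RLBench or this repo. It only reports a file path, so you still have to judge whether that path is the PerAct fork from the git submodule or an upstream install.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None.

    `module_origin` is a hypothetical helper for this thread; it uses only
    the standard library and does not import the module itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The module is not installed in the active environment.
        return None
    return spec.origin


if __name__ == "__main__":
    # If this prints a path outside the submodule checkout, the eval script
    # is probably using a different RLBench than the one it expects.
    print("rlbench is imported from:", module_origin("rlbench"))
```

Run it inside the same conda environment that you use for eval.py, since the result depends on the active environment.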

pawanw17 commented 4 months ago

Hi @wpumacay, your hunch about me using RLBench from upstream was correct. I tried using your data but unfortunately got a different error message:

ritical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-18 17:44:58.892076: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-18 17:44:59.451147: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-18 17:44:59.454092: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
MVT Vars: {'training': True, '_parameters': OrderedDict(), '_buffers': OrderedDict(), '_non_persistent_buffers_set': set(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': OrderedDict(), 'depth': 8, 'img_feat_dim': 3, 'img_size': 220, 'add_proprio': True, 'proprio_dim': 4, 'add_lang': True, 'lang_dim': 512, 'lang_len': 77, 'im_channels': 64, 'img_patch_size': 11, 'final_dim': 64, 'attn_dropout': 0.1, 'decoder_dropout': 0.0, 'self_cross_ver': 1, 'add_corr': True, 'add_pixel_loc': True, 'add_depth': True, 'pe_fix': True}
MVT Vars: {'training': True, '_parameters': OrderedDict(), '_buffers': OrderedDict(), '_non_persistent_buffers_set': set(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': OrderedDict(), 'depth': 8, 'img_feat_dim': 3, 'img_size': 220, 'add_proprio': True, 'proprio_dim': 4, 'add_lang': True, 'lang_dim': 512, 'lang_len': 77, 'im_channels': 64, 'img_patch_size': 11, 'final_dim': 64, 'attn_dropout': 0.1, 'decoder_dropout': 0.0, 'self_cross_ver': 1, 'add_corr': True, 'add_pixel_loc': True, 'add_depth': True, 'pe_fix': True}
Agent Information
<rvt.models.rvt_agent.RVTAgent object at 0x7f8658f90820>
Agent Information
<rvt.models.rvt_agent.RVTAgent object at 0x7f02ed59b820>
[CoppeliaSim:loadinfo]   done.
Traceback (most recent call last):
  File "eval.py", line 579, in <module>
    _eval(args)
  File "eval.py", line 533, in _eval
    scores = eval(
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "eval.py", line 316, in eval
    raise e
  File "eval.py", line 309, in eval
    for replay_transition in generator:
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/libs/YARR/yarr/utils/rollout_generator.py", line 40, in generator
    obs = env.reset_to_demo(eval_demo_seed)
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/utils/custom_rlbench_env.py", line 436, in reset_to_demo
    d = self._task.get_demos(
  File "/home/pawan/devel/rrc/act3d/colosseum/robot-colosseum/colosseum/rlbench/extensions/task_environment.py", line 47, in get_demos
    demos = utils.get_stored_demos(
  File "/home/pawan/devel/rrc/act3d/colosseum/RLBench/rlbench/utils.py", line 121, in get_stored_demos
    listdir(r_sh_depth_f)) == len(listdir(oh_rgb_f)) == len(
FileNotFoundError: [Errno 2] No such file or directory: 'data/test_variations_final/open_drawer_8/variation0/episodes/episode0/overhead_rgb'
[CoppeliaSim:loadinfo]   done.
Traceback (most recent call last):
  File "eval.py", line 579, in <module>
    _eval(args)
  File "eval.py", line 533, in _eval
    scores = eval(
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "eval.py", line 316, in eval
    raise e
  File "eval.py", line 309, in eval
    for replay_transition in generator:
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/libs/YARR/yarr/utils/rollout_generator.py", line 40, in generator
    obs = env.reset_to_demo(eval_demo_seed)
  File "/home/pawan/devel/rrc/act3d/colosseum/rvt_colosseum/rvt/utils/custom_rlbench_env.py", line 436, in reset_to_demo
    d = self._task.get_demos(
  File "/home/pawan/devel/rrc/act3d/colosseum/robot-colosseum/colosseum/rlbench/extensions/task_environment.py", line 47, in get_demos
    demos = utils.get_stored_demos(
  File "/home/pawan/devel/rrc/act3d/colosseum/RLBench/rlbench/utils.py", line 121, in get_stored_demos
    listdir(r_sh_depth_f)) == len(listdir(oh_rgb_f)) == len(
FileNotFoundError: [Errno 2] No such file or directory: 'data/test_variations_final/open_drawer_2/variation0/episodes/episode0/overhead_rgb'

wpumacay commented 4 months ago

Hi @pawanw17, sorry, that's another issue on my end. I have to recollect the data with the right camera configuration. I'll double check with my colleagues what their data collection configuration was for RVT. It seems that it's different from the default one that I use.

jesbu1 commented 4 months ago

seems to be fixed by simply commenting out lines 104, 105, and 106 of https://github.com/MohitShridhar/RLBench/blob/4bf93b8167b47b41f5ba744c5a1be00660cb203f/rlbench/utils.py

Looks like the file expects to see overhead camera info but it's not saved by colosseum. The overhead cameras aren't used to train RVT according to the peract_cfg anyway.
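Before editing the loader, it may help to see which camera folders are actually missing from the collected episodes. The following is a rough sketch, not part of RLBench or colosseum: `missing_camera_folders` is a hypothetical helper, and the `<task>/variation*/episodes/episode*` layout is assumed from the path in the traceback above. The list of expected folder names is passed in by the caller; only `overhead_rgb` is confirmed by the error message, the others are up to you.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def missing_camera_folders(dataset_root, expected: Iterable[str]) -> Dict[str, List[str]]:
    """Map each episode directory to the expected sub-folders it lacks.

    Assumes the layout <root>/<task>/variation*/episodes/episode*, as seen in
    the FileNotFoundError path above. Episodes with nothing missing are omitted.
    """
    expected = list(expected)
    report: Dict[str, List[str]] = {}
    pattern = "*/variation*/episodes/episode*"
    for episode in sorted(Path(dataset_root).glob(pattern)):
        if not episode.is_dir():
            continue
        missing = [name for name in expected if not (episode / name).is_dir()]
        if missing:
            report[str(episode)] = missing
    return report


if __name__ == "__main__":
    # "overhead_rgb" comes from the traceback; extend the list as needed.
    result = missing_camera_folders("data/test_variations_final", ["overhead_rgb"])
    for episode_dir, names in result.items():
        print(episode_dir, "is missing", names)
```

If every episode reports the same missing folders, the data was most likely collected with a camera configuration that differs from what the loader expects.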

pawanw17 commented 4 months ago

Umm... this leads to another issue

(rvt) pawan@pawan-Legion:~/devel/rrc/colosseum_ws/rvt_colosseum/rvt$ 2024-06-20 17:35:59.687583: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-20 17:35:59.687581: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-20 17:36:00.420600: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-20 17:36:00.420600: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
MVT Vars: {'training': True, '_parameters': OrderedDict(), '_buffers': OrderedDict(), '_non_persistent_buffers_set': set(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': OrderedDict(), 'depth': 8, 'img_feat_dim': 3, 'img_size': 220, 'add_proprio': True, 'proprio_dim': 4, 'add_lang': True, 'lang_dim': 512, 'lang_len': 77, 'im_channels': 64, 'img_patch_size': 11, 'final_dim': 64, 'attn_dropout': 0.1, 'decoder_dropout': 0.0, 'self_cross_ver': 1, 'add_corr': True, 'add_pixel_loc': True, 'add_depth': True, 'pe_fix': True}
MVT Vars: {'training': True, '_parameters': OrderedDict(), '_buffers': OrderedDict(), '_non_persistent_buffers_set': set(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': OrderedDict(), 'depth': 8, 'img_feat_dim': 3, 'img_size': 220, 'add_proprio': True, 'proprio_dim': 4, 'add_lang': True, 'lang_dim': 512, 'lang_len': 77, 'im_channels': 64, 'img_patch_size': 11, 'final_dim': 64, 'attn_dropout': 0.1, 'decoder_dropout': 0.0, 'self_cross_ver': 1, 'add_corr': True, 'add_pixel_loc': True, 'add_depth': True, 'pe_fix': True}
Agent Information
<rvt.models.rvt_agent.RVTAgent object at 0x7fcca614bdf0>
Agent Information
<rvt.models.rvt_agent.RVTAgent object at 0x7fafdcf192b0>
[CoppeliaSim:loadinfo]   done.
Traceback (most recent call last):
  File "eval.py", line 579, in <module>
    _eval(args)
  File "eval.py", line 533, in _eval
    scores = eval(
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "eval.py", line 316, in eval
    raise e
  File "eval.py", line 309, in eval
    for replay_transition in generator:
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/libs/YARR/yarr/utils/rollout_generator.py", line 52, in generator
    act_result = agent.act(step_signal.value, prepped_data,
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/models/rvt_agent.py", line 727, in act
    a, b = mvt_utils.place_pc_in_cube(
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/mvt/utils.py", line 48, in place_pc_in_cube
    pc_mid = (torch.max(pc, 0)[0] + torch.min(pc, 0)[0]) / 2
IndexError: max(): Expected reduction dim 0 to have non-zero size.
[CoppeliaSim:loadinfo]   done.
Traceback (most recent call last):
  File "eval.py", line 579, in <module>
    _eval(args)
  File "eval.py", line 533, in _eval
    scores = eval(
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "eval.py", line 316, in eval
    raise e
  File "eval.py", line 309, in eval
    for replay_transition in generator:
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/libs/YARR/yarr/utils/rollout_generator.py", line 52, in generator
    act_result = agent.act(step_signal.value, prepped_data,
  File "/home/pawan/miniconda3/envs/rvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/models/rvt_agent.py", line 727, in act
    a, b = mvt_utils.place_pc_in_cube(
  File "/home/pawan/devel/rrc/colosseum_ws/rvt_colosseum/rvt/mvt/utils.py", line 48, in place_pc_in_cube
    pc_mid = (torch.max(pc, 0)[0] + torch.min(pc, 0)[0]) / 2
IndexError: max(): Expected reduction dim 0 to have non-zero size.

wpumacay commented 4 months ago

Sorry for the delay, @pawanw17. I tried reproducing the issue but have had no luck so far. I think you might be missing the data for the other tasks. I checked and it seems you can't run the eval in single-task mode, so most likely you're missing the rest of the dataset, which is generated using colosseum. Did you run into any issues using colosseum for the data collection? I recall you mentioned that it crashed, so you could only collect data for a single task.
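To check whether parts of the dataset are missing, one option is to list the `<task>_<index>` folders that the eval would look for but that do not exist on disk. This is a minimal sketch under assumptions: `missing_variation_dirs` is a hypothetical helper, the folder naming follows the directory listing posted earlier in this thread, and treating the variation range as inclusive on both ends is a guess about how `$min_var_num` and `$max_var_num` are interpreted.

```python
from pathlib import Path
from typing import Iterable, List


def missing_variation_dirs(dataset_root, tasks: Iterable[str],
                           min_var: int, max_var: int) -> List[str]:
    """Return the '<task>_<index>' folder names that are absent under dataset_root.

    The inclusive range [min_var, max_var] is an assumption, not confirmed
    by the eval script.
    """
    root = Path(dataset_root)
    missing = []
    for task in tasks:
        for idx in range(min_var, max_var + 1):
            name = f"{task}_{idx}"
            if not (root / name).is_dir():
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Task list and range are placeholders; use the ones from your eval config.
    for name in missing_variation_dirs("data/test_variations_final",
                                       ["open_drawer"], 0, 17):
        print("missing:", name)
```

Passing the full task list from the eval configuration should show at a glance which tasks still need to be collected.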

robot-colosseum / rvt_colosseum, issue #2