GT-RIPL / robo-vln

Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
https://zubair-irshad.github.io/projects/robo-vln.html
MIT License

MemoryError: std::bad_alloc #8

Closed Moon-heart closed 1 year ago

Moon-heart commented 1 year ago

Thank you very much for your interesting research and for sharing the code. I encountered the following error when running the command "python run.py --exp-config robo_vln_baselines/config/paper_configs/seq2seq_robo.yaml --run-type eval".

Traceback (most recent call last):
  File "run.py", line 79, in <module>
    main()
  File "run.py", line 39, in main
    run_exp(vars(args))
  File "run.py", line 73, in run_exp
    trainer.eval()
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat_baselines/common/base_trainer.py", line 109, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/home/zzy/robo-vln/robo_vln_baselines/robo_vln_trainer.py", line 1033, in _eval_checkpoint
    self.envs = construct_env(config)
  File "/home/zzy/robo-vln/robo_vln_baselines/common/env_utils.py", line 113, in construct_env
    env = VLNCEDaggerEnv(config)
  File "/home/zzy/robo-vln/robo_vln_baselines/common/environments.py", line 12, in __init__
    super().__init__(config.TASK_CONFIG, dataset)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/core/env.py", line 331, in __init__
    self._env = Env(config, dataset)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/core/env.py", line 105, in __init__
    id_sim=self._config.SIMULATOR.TYPE, config=self._config.SIMULATOR
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/sims/registration.py", line 19, in make_sim
    return _sim(kwargs)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/sims/habitat_simulator/habitat_simulator.py", line 184, in __init__
    super().__init__(self.sim_config)
  File "", line 9, in __init__
  File "/home/zzy/.conda/envs/habitat/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 87, in __attrs_post_init__
    self.__set_from_config(self.config)
  File "/home/zzy/.conda/envs/habitat/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 199, in __set_from_config
    self._config_backend(config)
  File "/home/zzy/.conda/envs/habitat/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 136, in _config_backend
    super().__init__(config.sim_cfg)
MemoryError: std::bad_alloc

Then I ran "free -h" and found that there was sufficient memory. Can you help me and give me some advice?
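A note on the "free -h" check: it reports system-wide memory, while an allocation can also fail because of a per-process address-space limit (the kind set with "ulimit -v"), or because a library asked for an unreasonably large block. Whether either applies here is not known from the thread. The sketch below is only a way to look at both numbers from Python on Linux; the helper names parse_meminfo, format_limit and address_space_limit are made up for this example.

```python
import resource


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_limit(value):
    """Render an rlimit value (bytes) as text; RLIM_INFINITY means no limit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def address_space_limit():
    """Return the (soft, hard) address-space limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    # /proc/meminfo exists on Linux only; this part is skipped elsewhere.
    try:
        with open("/proc/meminfo") as f:
            mem = parse_meminfo(f.read())
        print("MemAvailable (kB):", mem.get("MemAvailable"))
    except FileNotFoundError:
        print("/proc/meminfo not available on this system")
    print("RLIMIT_AS (soft, hard):", address_space_limit())
```

If both limits print as "unlimited" and MemAvailable is large, a plain shortage of RAM becomes a less likely explanation, which fits what "free -h" already showed.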

zubair-irshad commented 1 year ago

Hi @Moon-heart, this looks like a habitat-sim issue. Can you please open an issue here on their respective repository? Also, we use a slightly outdated version of habitat-sim, which is detailed here.
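For readers following along, the traceback itself supports this reading: its deepest frame sits inside habitat_sim/simulator.py, and the egg path names the installed version. A minimal sketch of pulling both facts out of a pasted traceback follows; the function names are invented for this example, and SAMPLE is a shortened excerpt copied from the report above.

```python
import re

# Matches one traceback frame header: File "<path>", line <n>, in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]*)", line (?P<line>\d+), in (?P<func>\S+)')
# Matches the version embedded in an egg directory name such as habitat_sim-0.1.5-py3.6-...
EGG_RE = re.compile(r"habitat_sim-(?P<version>\d+(?:\.\d+)*)-py")


def frames(traceback_text):
    """Return [(path, line, function), ...] in the order the frames appear."""
    return [
        (m["path"], int(m["line"]), m["func"])
        for m in FRAME_RE.finditer(traceback_text)
    ]


def deepest_frame(traceback_text):
    """Return the last frame listed, i.e. where the exception was raised, or None."""
    found = frames(traceback_text)
    return found[-1] if found else None


def habitat_sim_version(traceback_text):
    """Return the habitat_sim version named in an egg path, or None if absent."""
    m = EGG_RE.search(traceback_text)
    return m["version"] if m else None


# Shortened excerpt of the traceback posted in this thread.
SAMPLE = (
    'File "/home/zzy/robo-vln/robo_vln_baselines/common/env_utils.py", line 113, '
    "in construct_env env = VLNCEDaggerEnv(config) "
    'File "/home/zzy/.conda/envs/habitat/lib/python3.6/site-packages/'
    'habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 136, '
    "in _config_backend super().init__(config.sim_cfg) MemoryError: std::bad_alloc"
)

if __name__ == "__main__":
    print("deepest frame:", deepest_frame(SAMPLE))
    print("habitat_sim version:", habitat_sim_version(SAMPLE))
```

The regular expressions work on the flattened, single-line form of the traceback as well as on a normally formatted one, since they do not depend on line breaks.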

Moon-heart commented 1 year ago

Hi, thank you very much for your reply. Inspired by it, I reinstalled Habitat-Lab and Habitat-Sim following the instructions on Facebook's official website and tested them as described there. I then re-ran the command "python run.py --exp-config robo_vln_baselines/config/paper_configs/robovln_data_train.yaml --run-type train", but it still reported an error. The train.log output shows a memory error during initialization of Habitat-Sim.

train.log details are as follows:
2023-09-27 17:20:53,845 Initializing dataset VLN-CE-v1
2023-09-27 17:20:54,397 Simulator GPU ID [0]
2023-09-27 17:20:54,398 Simulator GPU ID 0
2023-09-27 17:20:54,398 [construct_envs] Using GPU ID 0
2023-09-27 17:20:54,398 Initializing dataset VLN-CE-v1
2023-09-27 17:20:54,947 initializing sim Sim-v0

output details are as follows: 2023-09-27 17:20:53,810 config: BASE_TASK_CONFIG_PATH: habitat_extensions/config/robo_vln_task.yaml CHECKPOINT_FOLDER: data/robo-vln/checkpoints/ CMD_TRAILING_OPTS: [] DAGGER: BATCH_SIZE: 1 CKPT_TO_LOAD: data/checkpoints/ckpt.0.pth COLLECT_DATA_SPLIT: train EPOCHS: 25 INTER_MODULE_ATTN: False ITERATIONS: 1 LMDB_COMMIT_FREQUENCY: 500 LMDB_EVAL_DIR: data/trajectories_dirs/debug/trajectories.lmdb LMDB_EVAL_SIZE: 100000000000.0 LMDB_FEATURES_DIR: data/trajectories_dirs/robo-vln/train/trajectories.lmdb LMDB_MAP_SIZE: 2700000000000.0 LMDB_STORE_FREQUENCY: 5 LOAD_FROM_CKPT: False LR: 0.0001 P: 1.0 PRELOAD_LMDB_FEATURES: False UPDATE_SIZE: 7739 USE_IW: True split_dim: 0 tbptt_steps: 100 time_step: 0.03333333333333333 DDP: dist_backend: nccl dist_url: env:// distributed: False gpu: 0 rank: 0 world_size: 1 ENV_NAME: VLNCEDaggerEnv EVAL: EPISODE_COUNT: 570 EVAL_NONLEARNING: False NONLEARNING: AGENT: RandomAgent SPLIT: val_seen USE_CKPT_CONFIG: False VAL_LOG_DIR: validation_logging/ EVAL_CKPT_PATH_DIR: data/robo-vln/checkpoints/ LOG_FILE: train.log MODEL: ACTION_DECODER_TRANFORMER: N: 1 d_ff: 1024 d_model: 512 dropout: 0.1 fc_output: 512 h: 4 in_features: 32 CMA: rcm_state_encoder: False use: False use_prev_action: False DEPTH_ENCODER: backbone: resnet50 cnn_type: VlnResnetDepthEncoder ddppo_checkpoint: data/ddppo-models/gibson-2plus-resnet50.pth output_size: 128 FLAT_AUX_LOSS: use: False HIERARCHICAL:

HYBRID_STATE_DECODER: N: 1 RNN_output_size: 512 d_ff: 1024 d_in: 512 d_model: 512 d_out: 256 dropout: 0.1 fc_output: 512 h: 4 hidden_size: 512 in_features: 512 prev_action_embedding_dim: 32 rnn_type: LSTM IMAGE_CROSS_MODAL_ENCODER: N: 1 d_ff: 1024 d_in: 512 d_model: 256 d_out: 256 dropout: 0.2 h: 2 INSTRUCTION_ENCODER: bidirectional: False dataset_vocab: data/datasets/R2R_VLNCE_v1_preprocessed/train/train.json.gz dropout_ratio: 0.25 embedding_file: data/datasets/robo_vln_v1/embeddings.json.gz embedding_size: 50 final_state_only: True fine_tune_embeddings: False hidden_size: 256 is_bert: True max_length: 200 num_layers: 1 rnn_type: LSTM use_pretrained_embeddings: True vocab_size: 2504 INTER_MODULE_ATTN: N: 1 d_ff: 1024 d_model: 512 dropout: 0.1 fc_output: 512 h: 4 in_features: 512 LANG_ATTN: hidden_size: 256 use: False PROGRESS_MONITOR: alpha: 1.0 use: False RGB_ENCODER: cnn_type: TorchVisionResNet50 output_size: 256 resnet_output_size: 256 SEM_ATTN_ENCODER: hidden_size: 256 use: False SEM_MAP_TRANSFORMER: N: 1 d_ff: 1024 d_in: 128 d_model: 512 d_out: 256 downsample_size: 20 dropout: 0.1 embedding_dim: 128 h: 4 layer_norm_eps: 1e-12 n_output: 512 SEM_TEXT_ATTN: hidden_size: 256 use: False SEQ2SEQ: use_prev_action: False STATE_ENCODER: hidden_size: 512 rnn_type: LSTM TRANSFORMER: hidden_size: 512 lr: 0.0001 lr_drop: 4 output_size: 512 scheduler_patience: 0.0001 split_gpus: False use: False use_prev_action: True weight_decay: 0.001 TRANSFORMER_INSTRUCTION_ENCODER: N: 1 d_ff: 1024 d_in: 768 d_model: 256 dropout: 0.2 h: 4 is_bert: True VISUAL_LING_ATTN: N: 1 d_ff: 1024 d_model: 256 dropout: 0.25 fc_output: 512 h: 4 ins_in_features: 768 vis_in_features: 256 ablate_depth: False ablate_instruction: False ablate_rgb: False ablate_sem_attn: False inflection_weight_coef: 3.2 NUM_PROCESSES: 1 PLOT_ATTENTION: False SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR'] SIMULATOR_GPU_ID: [0] TASK_CONFIG: DATASET: CONTENT_SCENES: ['*'] DATA_PATH: 
data/datasets/robo_vln_v1/{split}/{split}.json.gz SCENES_DIR: data/scene_datasets/ SPLIT: train TYPE: VLN-CE-v1 ENVIRONMENT: ITERATOR_OPTIONS: CYCLE: True GROUP_BY_SCENE: True MAX_SCENE_REPEAT_EPISODES: -1 MAX_SCENE_REPEAT_STEPS: 10000 NUM_EPISODE_SAMPLE: -1 SHUFFLE: True STEP_REPETITION_RANGE: 0.2 MAX_EPISODE_SECONDS: 10000000 MAX_EPISODE_STEPS: 1000 PYROBOT: BASE_CONTROLLER: proportional BASE_PLANNER: none BUMP_SENSOR: TYPE: PyRobotBumpSensor DEPTH_SENSOR: CENTER_CROP: False HEIGHT: 480 MAX_DEPTH: 5.0 MIN_DEPTH: 0.0 NORMALIZE_DEPTH: True TYPE: PyRobotDepthSensor WIDTH: 640 LOCOBOT: ACTIONS: ['BASE_ACTIONS', 'CAMERA_ACTIONS'] BASE_ACTIONS: ['go_to_relative', 'go_to_absolute'] CAMERA_ACTIONS: ['set_pan', 'set_tilt', 'set_pan_tilt'] RGB_SENSOR: CENTER_CROP: False HEIGHT: 480 TYPE: PyRobotRGBSensor WIDTH: 640 ROBOT: locobot ROBOTS: ['locobot'] SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR', 'BUMP_SENSOR'] SEED: 100 SIMULATOR: ACTION_SPACE_CONFIG: v0 AGENTS: ['AGENT_0'] AGENT_0: ANGULAR_ACCELERATION: 12.56 ANGULAR_FRICTION: 1.0 COEFFICIENT_OF_RESTITUTION: 0.0 HEIGHT: 1.5 IS_SET_START_STATE: False LINEAR_ACCELERATION: 20.0 LINEAR_FRICTION: 0.5 MASS: 32.0 RADIUS: 0.1 SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR'] START_POSITION: [0, 0, 0] START_ROTATION: [0, 0, 0, 1] DEFAULT_AGENT_ID: 0 DEPTH_SENSOR: HEIGHT: 256 HFOV: 90 MAX_DEPTH: 10.0 MIN_DEPTH: 0.0 NORMALIZE_DEPTH: True ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimDepthSensor WIDTH: 256 FORWARD_STEP_SIZE: 0.25 HABITAT_SIM_V0: ALLOW_SLIDING: True ENABLE_PHYSICS: False GPU_DEVICE_ID: 0 GPU_GPU: False PHYSICS_CONFIG_FILE: ./data/default.phys_scene_config.json RGB_SENSOR: HEIGHT: 224 HFOV: 90 ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimRGBSensor WIDTH: 224 SCENE: data/scene_datasets/habitat-test-scenes/van-gogh-room.glb SEED: 100 SEMANTIC_SENSOR: HEIGHT: 480 HFOV: 90 ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimSemanticSensor WIDTH: 640 TILT_ANGLE: 15 TURN_ANGLE: 15 
TYPE: Sim-v0 TASK: ACTIONS: ANSWER: TYPE: AnswerAction LOOK_DOWN: TYPE: LookDownAction LOOK_UP: TYPE: LookUpAction MOVE_FORWARD: TYPE: MoveForwardAction STOP: TYPE: StopAction TELEPORT: TYPE: TeleportAction TURN_LEFT: TYPE: TurnLeftAction TURN_RIGHT: TYPE: TurnRightAction ANSWER_ACCURACY: TYPE: AnswerAccuracy COLLISIONS: TYPE: Collisions COMPASS_SENSOR: TYPE: CompassSensor CORRECT_ANSWER: TYPE: CorrectAnswer DISTANCE_TO_GOAL: DISTANCE_TO: POINT TYPE: DistanceToGoal EPISODE_INFO: TYPE: EpisodeInfo GLOBAL_GPS_SENSOR: DIMENSIONALITY: 3 TYPE: GlobalGPSSensor GOAL_SENSOR_UUID: pointgoal GPS_SENSOR: DIMENSIONALITY: 2 TYPE: GPSSensor HEADING_SENSOR: TYPE: HeadingSensor IMAGEGOAL_SENSOR: TYPE: ImageGoalSensor INSTRUCTION_SENSOR: TYPE: InstructionSensor INSTRUCTION_SENSOR_UUID: instruction MEASUREMENTS: ['DISTANCE_TO_GOAL', 'SUCCESS', 'SPL', 'PATH_LENGTH', 'NAVIGATION_ERROR', 'STEPS_TAKEN'] NAVIGATION_ERROR: TYPE: NavigationError NDTW: FDTW: True GT_PATH: data/datasets/robo_vln_v1/{split}/{split}_gt.json.gz SPLIT: val_seen SUCCESS_DISTANCE: 3.0 TYPE: NDTW OBJECTGOAL_SENSOR: GOAL_SPEC: TASK_CATEGORY_ID GOAL_SPEC_MAX_VAL: 50 TYPE: ObjectGoalSensor ORACLE_ACTION_SENSOR: GOAL_RADIUS: 0.5 TYPE: OracleActionSensor ORACLE_NAVIGATION_ERROR: TYPE: OracleNavigationError ORACLE_SPL: SUCCESS_DISTANCE: 0.2 TYPE: OracleSPL ORACLE_SUCCESS: SUCCESS_DISTANCE: 3.0 TYPE: OracleSuccess PATH_LENGTH: TYPE: PathLength POINTGOAL_SENSOR: DIMENSIONALITY: 2 GOAL_FORMAT: POLAR TYPE: PointGoalSensor POINTGOAL_WITH_GPS_COMPASS_SENSOR: DIMENSIONALITY: 2 GOAL_FORMAT: POLAR TYPE: PointGoalWithGPSCompassSensor POSSIBLE_ACTIONS: ['STOP', 'MOVE_FORWARD', 'TURN_LEFT', 'TURN_RIGHT'] PROXIMITY_SENSOR: MAX_DETECTION_RADIUS: 2.0 TYPE: ProximitySensor QUESTION_SENSOR: TYPE: QuestionSensor SDTW: FDTW: True GT_PATH: data/datasets/robo_vln_v1/{split}/{split}_gt.json.gz SPLIT: val_seen SUCCESS_DISTANCE: 3.0 TYPE: SDTW SENSORS: ['INSTRUCTION_SENSOR', 'VLN_ORACLE_ACTION_SENSOR', 'VLN_ORACLE_PROGRESS_SENSOR', 
'HEADING_SENSOR'] SOFT_SPL: TYPE: SoftSPL SPL: SUCCESS_DISTANCE: 3.0 TYPE: SPL STEPS_TAKEN: TYPE: StepsTaken SUCCESS: SUCCESS_DISTANCE: 3.0 TYPE: Success SUCCESS_DISTANCE: 3.0 TOP_DOWN_MAP: DRAW_BORDER: True DRAW_GOAL_AABBS: True DRAW_GOAL_POSITIONS: True DRAW_SHORTEST_PATH: True DRAW_SOURCE: True DRAW_VIEW_POINTS: True FOG_OF_WAR: DRAW: True FOV: 90 VISIBILITY_DIST: 5.0 MAP_PADDING: 3 MAP_RESOLUTION: 1250 MAX_EPISODE_STEPS: 1000 NUM_TOPDOWN_MAP_SAMPLE_POINTS: 20000 TYPE: TopDownMap TYPE: VLN-v0 VLN_ORACLE_ACTION_SENSOR: GOAL_RADIUS: 0.5 TYPE: VLNOracleActionSensor VLN_ORACLE_PROGRESS_SENSOR: TYPE: VLNOracleProgressSensor TENSORBOARD_DIR: data/robo-vln/tensorboard_dirs/ TORCH_GPU_ID: 0 TRAINER_NAME: robo_vln_trainer VIDEO_DIR: VIDEO_OPTION: []
2023-09-27 17:20:53,845 Initializing dataset VLN-CE-v1
2023-09-27 17:20:54,397 Simulator GPU ID [0]
2023-09-27 17:20:54,398 Simulator GPU ID 0
2023-09-27 17:20:54,398 [construct_envs] Using GPU ID 0
2023-09-27 17:20:54,398 Initializing dataset VLN-CE-v1
2023-09-27 17:20:54,947 initializing sim Sim-v0
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0927 17:20:54.952857 288376 AssetAttributesManager.cpp:121] Asset attributes (capsule3DSolid) created and registered.
I0927 17:20:54.952905 288376 AssetAttributesManager.cpp:121] Asset attributes (capsule3DWireframe) created and registered.
I0927 17:20:54.952929 288376 AssetAttributesManager.cpp:121] Asset attributes (coneSolid) created and registered.
I0927 17:20:54.952944 288376 AssetAttributesManager.cpp:121] Asset attributes (coneWireframe) created and registered.
I0927 17:20:54.952952 288376 AssetAttributesManager.cpp:121] Asset attributes (cubeSolid) created and registered.
I0927 17:20:54.952960 288376 AssetAttributesManager.cpp:121] Asset attributes (cubeWireframe) created and registered.
I0927 17:20:54.952997 288376 AssetAttributesManager.cpp:121] Asset attributes (cylinderSolid) created and registered.
I0927 17:20:54.953015 288376 AssetAttributesManager.cpp:121] Asset attributes (cylinderWireframe) created and registered.
I0927 17:20:54.953025 288376 AssetAttributesManager.cpp:121] Asset attributes (icosphereSolid) created and registered.
I0927 17:20:54.953035 288376 AssetAttributesManager.cpp:121] Asset attributes (icosphereWireframe) created and registered.
I0927 17:20:54.953048 288376 AssetAttributesManager.cpp:121] Asset attributes (uvSphereSolid) created and registered.
I0927 17:20:54.953063 288376 AssetAttributesManager.cpp:121] Asset attributes (uvSphereWireframe) created and registered.
I0927 17:20:54.953068 288376 AssetAttributesManager.cpp:108] AssetAttributesManager::buildCtorFuncPtrMaps : Built default primitive asset templates : 12
I0927 17:20:54.953444 288376 PhysicsAttributesManager.cpp:38] File (./data/default.phys_scene_config.json) not found so new, default physics manager attributes created and registered.
I0927 17:20:54.953497 288376 StageAttributesManager.cpp:74] File (data/scene_datasets/mp3d/1pXnuDYAj8r/1pXnuDYAj8r.glb) Based stage attributes created and registered.
W0927 17:20:54.953509 288376 Simulator.cpp:132] Navmesh file not found, checked at
I0927 17:20:54.953531 288376 SceneGraph.h:92] Created DrawableGroup:
Renderer: NVIDIA RTX A6000/PCIe/SSE2 by NVIDIA Corporation
OpenGL version: 4.6.0 NVIDIA 535.54.03
Using optional features:
    GL_ARB_ES2_compatibility
    GL_ARB_direct_state_access
    GL_ARB_get_texture_sub_image
    GL_ARB_invalidate_subdata
    GL_ARB_multi_bind
    GL_ARB_robustness
    GL_ARB_separate_shader_objects
    GL_ARB_texture_filter_anisotropic
    GL_ARB_texture_storage
    GL_ARB_texture_storage_multisample
    GL_ARB_vertex_array_object
    GL_KHR_debug
Using driver workarounds:
    no-forward-compatible-core-context
    nv-egl-incorrect-gl11-function-pointers
    no-layout-qualifiers-on-old-glsl
    nv-zero-context-profile-mask
    nv-implementation-color-read-format-dsa-broken
    nv-cubemap-inconsistent-compressed-image-size
    nv-cubemap-broken-full-compressed-image-query
    nv-compressed-block-size-in-bits
I0927 17:20:55.005424 288376 ResourceManager.cpp:920] Importing Basis files as BC7
I0927 17:20:55.005686 288376 PhysicsManager.cpp:33] Deconstructing PhysicsManager
I0927 17:20:55.005698 288376 SceneManager.h:24] Deconstructing SceneManager
I0927 17:20:55.005703 288376 SceneGraph.h:25] Deconstructing SceneGraph
I0927 17:20:55.005818 288376 Renderer.cpp:33] Deconstructing Renderer
I0927 17:20:55.005823 288376 WindowlessContext.h:16] Deconstructing WindowlessContext
Traceback (most recent call last):
  File "run.py", line 79, in <module>
    main()
  File "run.py", line 39, in main
    run_exp(vars(args))
  File "run.py", line 71, in run_exp
    trainer.train()
  File "/home/zzy/robo-vln/robo_vln_baselines/robo_vln_trainer.py", line 858, in train
    self.envs = construct_env(self.config)
  File "/home/zzy/robo-vln/robo_vln_baselines/common/env_utils.py", line 113, in construct_env
    env = VLNCEDaggerEnv(config)
  File "/home/zzy/robo-vln/robo_vln_baselines/common/environments.py", line 12, in __init__
    super().__init__(config.TASK_CONFIG, dataset)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/core/env.py", line 333, in __init__
    self._env = Env(config, dataset)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/core/env.py", line 104, in __init__
    id_sim=self._config.SIMULATOR.TYPE, config=self._config.SIMULATOR
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/sims/registration.py", line 19, in make_sim
    return _sim(kwargs)
  File "/home/zzy/robo-vln/environments/habitat-lab/habitat/sims/habitat_simulator/habitat_simulator.py", line 185, in __init__
    self._sim = habitat_sim.Simulator(self.sim_config)
  File "", line 9, in __init__
  File "/home/zzy/.conda/envs/habitat1/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 87, in __attrs_post_init__
    self.__set_from_config(self.config)
  File "/home/zzy/.conda/envs/habitat1/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 199, in __set_from_config
    self._config_backend(config)
  File "/home/zzy/.conda/envs/habitat1/lib/python3.6/site-packages/habitat_sim-0.1.5-py3.6-linux-x86_64.egg/habitat_sim/simulator.py", line 136, in _config_backend
    super().__init__(config.sim_cfg)
MemoryError: std::bad_alloc
Segmentation fault (core dumped)
(habitat1) zzy@new-server:~/robo-vln$

Can you give me some suggestions? Thank you so much.
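One cheap thing to rule out, offered as an assumption rather than a diagnosis: the log reports "Navmesh file not found" and the simulator is torn down right after "Importing Basis files as BC7", so it may be worth confirming that the scene file named in the log exists and is a readable binary glTF. A .glb file starts with the four ASCII bytes "glTF". The sketch below checks only that; check_scene_file is a hypothetical helper, and the path in the usage section is the one printed by StageAttributesManager above.

```python
import os

# A binary glTF (.glb) file begins with this 4-byte magic.
GLB_MAGIC = b"glTF"


def check_scene_file(path):
    """Return 'missing', 'empty', 'bad header' or 'ok' for a scene file path."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as f:
        magic = f.read(4)
    # Only .glb files are expected to carry the glTF magic.
    if path.lower().endswith(".glb") and magic != GLB_MAGIC:
        return "bad header"
    return "ok"


if __name__ == "__main__":
    # Scene path as printed in the log; run this from the robo-vln directory.
    scene = "data/scene_datasets/mp3d/1pXnuDYAj8r/1pXnuDYAj8r.glb"
    print(scene, "->", check_scene_file(scene))
```

An "ok" result does not prove the file is complete or compatible with habitat-sim 0.1.5; it only excludes a missing, empty, or obviously wrong file.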