facebookresearch / OccupancyAnticipation

This repository contains code for our publication "Occupancy Anticipation for Efficient Exploration and Navigation" in ECCV 2020.
MIT License

How to set appropriate values in the .yaml file to avoid low GPU-util but high GPU memory usage? #35

Closed AgentEXPL closed 3 years ago

AgentEXPL commented 3 years ago

I used torch==1.9.0 and CUDA 11.1, with two A40 GPUs. The code works once "fill_value" is removed from the inputs to the scatter_max function at Line 78 in mapnet.py. The fps is nearly 30.
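For context, `fill_value` in older `torch_scatter` releases only set the initial value of output slots that receive no input; newer releases dropped the argument. A toy pure-Python sketch of the semantics (a hypothetical helper for illustration, not the library code):

```python
def scatter_max(src, index, dim_size, fill_value=0):
    """Toy 1-D scatter_max: out[i] = max of src[j] where index[j] == i.

    Slots that receive no value stay at fill_value; newer torch_scatter
    versions removed this argument, so pre-fill the output tensor instead.
    """
    out = [fill_value] * dim_size
    for val, idx in zip(src, index):
        out[idx] = max(out[idx], val)
    return out

# Slot 1 receives no value, so it keeps the fill value.
print(scatter_max([3, 1, 5], [0, 2, 2], dim_size=3, fill_value=-1))
# [3, -1, 5]
```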

What confuses me is that the GPU-util is extremely low (&lt;10%) while the GPU memory usage is high (nearly 50%) when training a model (e.g., the "ans-depth" model). The GPU-util of the GPU assigned to the SIMULATOR and MAPPER is nearly 0. Is this normal, or is something wrong caused by the changed installation environment?

What should I do to improve the GPU-util? Is it possible to improve it by setting config values in the .yaml file? What is the relationship among RL.ANS.MAPPER.replay_size, map_batch_size, and NUM_PROCESSES? I have no idea how to set appropriate values for "replay_size" and "map_batch_size". It would be of great help if some explanation could be provided.

AgentEXPL commented 3 years ago

I would like to close this issue since I found appropriate values after trying several times.

srama2512 commented 3 years ago

Hi @AgentEXPL. Thank you for bringing up this issue. I've only trained this model with 8 GPUs and 16/32GB memory per GPU. So I don't have a good answer to this. But this is something valuable to know for others in the community. Can you send a pull request to the README on how you resolved this?

AgentEXPL commented 3 years ago

Hi @srama2512. Actually, I am not sure whether my answer is the right one. I only tested two models: the ans_depth model and the occant_depth model. From my experience, the SIMULATOR GPU shows relatively high memory usage with low GPU-util when NUM_PROCESSES is relatively high. NUM_PROCESSES can therefore be set according to the total memory of the GPU, where each process occupies roughly 2~3 GB. As for GPU-util, it is mainly driven by the MAPPER when an occupancy anticipation model is used: comparing ans_depth and occant_depth, occant_depth achieves much higher GPU-util.
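The rule of thumb above (roughly 2~3 GB per simulator process, observed empirically rather than documented by the repository) can be turned into a quick estimate. A minimal sketch, with the per-process footprint and reserved headroom as assumed parameters:

```python
def max_num_processes(gpu_mem_gb, per_process_gb=2.5, reserve_gb=4.0):
    """Rough upper bound on NUM_PROCESSES for one simulator GPU.

    per_process_gb: observed ~2-3 GB per simulator process (assumption);
    reserve_gb: headroom left for the mapper/policy sharing the GPU.
    """
    return int((gpu_mem_gb - reserve_gb) // per_process_gb)

# A 48 GB A40 with 4 GB of headroom supports roughly 17 processes.
print(max_num_processes(48))
# 17
```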

The following is one of my settings. I would like to use the same GPUs for both the mapper and the simulator when I do not have enough GPUs.

Setting 1: two A40 GPUs, each with 48 GB of memory.

```yaml
BASE_TASK_CONFIG_PATH: "configs/exploration/gibson_train.yaml"
TRAINER_NAME: "occant_exp"
ENV_NAME: "ExpRLEnv"
SIMULATOR_GPU_ID: 4
SIMULATOR_GPU_IDS: [4, 5]
TORCH_GPU_ID: 4
VIDEO_OPTION: ["disk", "tensorboard"]
TENSORBOARD_DIR: "tb"
VIDEO_DIR: "video_dir"
EVAL_CKPT_PATH_DIR: "data/new_checkpoints"
NUM_PROCESSES: 18
SENSORS: ["RGB_SENSOR", "DEPTH_SENSOR"]
CHECKPOINT_FOLDER: "data/new_checkpoints"
NUM_EPISODES: 10000
T_EXP: 1000

RL:
  PPO:
    # ppo params
    ppo_epoch: 4
    num_mini_batch: 4
  ANS:
    # Uncomment this for anticipation reward
    # reward_type: "map_accuracy"
    image_scale_hw: [128, 128]
    MAPPER:
      map_size: 65
      registration_type: "moving_average"
      label_id: "ego_map_gt_anticipated"
      ignore_pose_estimator: False
      map_batch_size: 120
      use_data_parallel: True
      replay_size: 100000
      gpu_ids: [4, 5]
    OCCUPANCY_ANTICIPATOR:
      type: "occant_depth"
```
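One sanity check worth running on values like these (assuming `use_data_parallel` splits each mapper batch across `gpu_ids` in the standard `torch.nn.DataParallel` fashion, which I have not verified against the repository internals):

```python
def check_mapper_config(map_batch_size, replay_size, gpu_ids):
    """Sanity-check mapper settings before launching training.

    Assumes use_data_parallel splits each batch evenly across gpu_ids
    (standard torch.nn.DataParallel behaviour); returns samples per GPU.
    """
    assert map_batch_size % len(gpu_ids) == 0, \
        "map_batch_size should divide evenly across GPUs"
    assert replay_size >= map_batch_size, \
        "replay buffer must hold at least one full batch"
    return map_batch_size // len(gpu_ids)

# 120 samples split over 2 GPUs -> 60 per GPU
print(check_mapper_config(120, 100000, [4, 5]))
# 60
```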