cftang0827 / pedestrian-detection-ssdlite

Use TensorFlow object detection API and MobileNet SSDLite model to train a pedestrian detector by using VOC 2007 + 2012 dataset
MIT License

Out of memory when training the mobilenetv2-ssdlite? #2

Open SyGoing opened 5 years ago

SyGoing commented 5 years ago

CUDA out of memory. The log is like this:

2019-05-09 15:24:17.105701: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 6.25G (6712326144 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2019-05-09 15:24:17.534339: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 5.63G (6041093120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2019-05-09 15:24:17.942804: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 5.06G (5436983808 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

cftang0827 commented 5 years ago

It seems like your GPU has limited memory. You can lower the batch_size in the training configuration file:

pedestrian_detection_ssdlite/train/ssdlite_mobilenet_v2_coco.config

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt"
  fine_tune_checkpoint_type:  "detection"
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

The default batch_size is 24. If you are using a GPU with less memory, you may adjust the batch_size down to 5 or so.

Thanks.