Error at: caffe2/core/context_gpu.cu:343: out of memory

ghost commented 6 years ago

Hi, thanks for the great work!

We ran into an out-of-memory issue when running test_net.py on the COCO dataset with two TITAN X GPUs. Installation went fine, and the COCO datasets were placed in /lib/datasets/data/coco. We added one line to test_net.py so that it would use the 3rd and 4th GPUs: os.environ['CUDA_VISIBLE_DEVICES'] = "2,3"

We tried to run the code: ./tools/test_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --multi-gpu-testing TEST.WEIGHTS https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl NUM_GPUS 2

and encountered the following error:

terminate called after throwing an instance of 'caffe2::EnforceNotMet'
  what():  [enforce fail at context_gpu.cu:343] error == cudaSuccess. 2 vs 0. Error at: /home/user/default/caffe2/caffe2/core/context_gpu.cu:343: out of memory Error from operator:
input: "gpu_0/roi_feat_fpn2" input: "gpu_0/roi_feat_fpn3" input: "gpu_0/roi_feat_fpn4" input: "gpu_0/roi_feat_fpn5" output: "gpu_0/roi_feat_shuffled" output: "gpu_0/_concat_roi_feat" name: "" type: "Concat" arg { name: "axis" i: 0 } device_option { device_type: 1 cuda_gpu_id: 0 }
*** Aborted at 1516700071 (unix time) try "date -d @1516700071" if you are using GNU date ***
PC: @     0x7faf48c0c428 gsignal
*** SIGABRT (@0x3e800001c16) received by PID 7190 (TID 0x7fae72ffd700) from PID 7190; stack trace: ***
    @     0x7faf48fb2390 (unknown)
    @     0x7faf48c0c428 gsignal
    @     0x7faf48c0e02a abort
    @     0x7faf45bf484d __gnu_cxx::__verbose_terminate_handler()
    @     0x7faf45bf26b6 (unknown)
    @     0x7faf45bf2701 std::terminate()
    @     0x7faf45c1dd38 (unknown)
    @     0x7faf48fa86ba start_thread
    @     0x7faf48cde3dd clone
    @                0x0 (unknown)
Aborted (core dumped)
Traceback (most recent call last):
  File "./tools/test_net.py", line 168, in <module>
    main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing)
  File "./tools/test_net.py", line 133, in main
    results = parent_func(multi_gpu=multi_gpu_testing)
  File "/home/user/default/Detectron/lib/core/test_engine.py", line 59, in test_net_on_dataset
    num_images, output_dir
  File "/home/user/default/Detectron/lib/core/test_engine.py", line 82, in multi_gpu_test_net_on_dataset
    'detection', num_images, binary, output_dir
  File "/home/user/default/Detectron/lib/utils/subprocess.py", line 83, in process_in_parallel
    log_subprocess_output(i, p, output_dir, tag, start, end)
  File "/home/user/default/Detectron/lib/utils/subprocess.py", line 121, in log_subprocess_output
    assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)
AssertionError: Range subprocess failed (exit code: 134)

Are there any settings that we are missing? Thank you!
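A side note on the final assertion: exit code 134 is consistent with the usual shell convention of reporting 128 plus the signal number when a child process is killed by a signal, and 134 - 128 = 6 is SIGABRT on Linux, matching the SIGABRT in the stack trace. The helper below is a minimal Python 3 sketch of that arithmetic; `describe_exit_code` is a hypothetical name, not part of Detectron.

```python
import signal


def describe_exit_code(code):
    """Describe a shell-style exit code (128 + N means killed by signal N)."""
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals is available in Python 3.5+.
            return "killed by {} (signal {})".format(
                signal.Signals(signum).name, signum)
        except ValueError:
            return "exit code {} (unknown signal {})".format(code, signum)
    return "exit code {}".format(code)


print(describe_exit_code(134))
```

So the assertion only reports that the worker aborted; the underlying cause is the CUDA out-of-memory error printed earlier.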

jwnsu commented 6 years ago

For multiple GPUs, it seems to support only the lowest GPU ids by default. I also got an error when I tried to specify other GPU ids (e.g. '1,3,5,7' instead of the lowest '0,1,2,3'). Your out-of-memory error seems to come from overloading GPUs "0,1" (your setting "2,3" is ignored, and it still tries to run on GPUs "0,1"). You can verify this by checking nvidia-smi.
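One way to do the nvidia-smi check from a script is to use its CSV query mode and look at memory use per GPU index while the job runs. This is a minimal sketch, assuming nvidia-smi is on PATH; the function names are hypothetical and not part of Detectron.

```python
import subprocess

# nvidia-smi query mode: one CSV line per GPU, values in MiB without units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines like '0, 1044, 8116' into {index: (used_mib, total_mib)}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (int(field) for field in line.split(","))
        usage[index] = (used, total)
    return usage


def query_gpu_memory():
    """Run nvidia-smi and return per-GPU memory usage in MiB."""
    output = subprocess.check_output(QUERY).decode("utf-8")
    return parse_gpu_memory(output)
```

Calling `query_gpu_memory()` while the test is running shows which physical GPUs are actually filling up. If GPUs 0 and 1 grow while 2 and 3 stay idle, the "2,3" setting is not being applied.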

rbgirshick commented 6 years ago

Thanks for reporting this (it doesn't appear in our internal cluster environment and so we didn't catch the issue earlier). We will have a diff to fix it soon.

ir413 commented 6 years ago

Please see c941633.
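The commit is not reproduced here. As a rough illustration of the general idea only, a launcher that starts one worker process per GPU can take each worker's device from the parent's CUDA_VISIBLE_DEVICES instead of always counting from 0. The sketch below assumes the variable holds plain integer ids; `visible_gpu_ids` and `child_environments` are hypothetical names, not Detectron functions.

```python
import os


def visible_gpu_ids(num_gpus, environ=None):
    """Return the physical GPU ids that num_gpus workers should use."""
    environ = os.environ if environ is None else environ
    visible = environ.get("CUDA_VISIBLE_DEVICES", "")
    if visible.strip():
        # Assumption: integer ids only (CUDA also accepts GPU UUIDs).
        ids = [int(token) for token in visible.split(",")]
    else:
        ids = list(range(num_gpus))
    if len(ids) < num_gpus:
        raise ValueError(
            "requested {} GPUs but only {} are visible".format(
                num_gpus, len(ids)))
    return ids[:num_gpus]


def child_environments(num_gpus, environ=None):
    """Build one environment per worker, each exposing exactly one GPU."""
    base = dict(os.environ if environ is None else environ)
    envs = []
    for gpu_id in visible_gpu_ids(num_gpus, base):
        env = dict(base)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        envs.append(env)
    return envs
```

With CUDA_VISIBLE_DEVICES set to "2,3" and NUM_GPUS 2, the two workers would be given "2" and "3" respectively. Inside each worker the single visible GPU is still addressed as device 0, which is why operator messages can mention cuda_gpu_id: 0 regardless of the physical card.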

jwnsu commented 6 years ago

Thanks, that was a lightning-fast response. However, after pulling the change, training with non-lowest GPU ids still gives an error (a CUDA GPU memory access error). Training with the lowest GPU ids works fine.

ir413 commented 6 years ago

That might be a different issue but it's hard to tell. Could you please provide some more info on the issue? (error stack trace, GPU types used, etc.)

ghost commented 6 years ago

@ir413 Thanks, just confirming that the issue has been resolved.

Results:
Dataset: coco_2014_minival
Task: box
AP,AP50,AP75,APs,APm,APl
0.4089,0.6193,0.4478,0.2350,0.4421,0.5389
Task: mask
AP,AP50,AP75,APs,APm,APl
0.3639,0.5846,0.3869,0.1664,0.3915,0.5400

ir413 commented 6 years ago

@loackerc: Thanks for confirming.

@jwnsu: Thanks for reporting. This is indeed a different issue which we will address in #32.

gf19880710 commented 6 years ago

@ir413
Hello Ilija, I have been trying to use Detectron with my own dataset. Everything in the tutorial works fine, with no issues. But after preparing my own dataset (labeling all images with labelme and converting them to the standard COCO data format), I tried to train the model and ran into the same issue as @loackerc. Could you suggest a solution? The full log is below:

python tools/train_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml OUTPUT_DIR out_dir NUM_GPUS 1
Found Detectron ops lib: /home/gengfeng/anaconda3/envs/caffe2_detectron/lib/python2.7/site-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
INFO train_net.py:  95: Called with args:
INFO train_net.py:  96: Namespace(cfg_file='configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml', multi_gpu_testing=False, opts=['OUTPUT_DIR', 'out_dir', 'NUM_GPUS', '1'], skip_test=False)
INFO train_net.py: 103: cuda version : 9000
INFO train_net.py: 104: cudnn version: 7102
INFO train_net.py: 105: nvidia-smi output:
Wed Oct 24 17:06:45 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
| 42%   38C    P8    16W / 200W |   1044MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1116      G   /usr/lib/xorg/Xorg                            18MiB |
|    0      1158      G   /usr/bin/gnome-shell                          49MiB |
|    0      3935      G   /usr/lib/xorg/Xorg                           329MiB |
|    0      4067      G   /usr/bin/gnome-shell                         401MiB |
|    0      4599      G   ...quest-channel-token=3154153029565847270   109MiB |
|    0      5691      G   ...passed-by-fd --v8-snapshot-passed-by-fd    12MiB |
|    0     10650      G   ...passed-by-fd --v8-snapshot-passed-by-fd    22MiB |
|    0     10727      G   /opt/teamviewer/tv_bin/TeamViewer             20MiB |
|    0     15117      G   .../pycharm-professional/89/jre64/bin/java    21MiB |
|    0     15906      G   ...passed-by-fd --v8-snapshot-passed-by-fd    12MiB |
|    0     21679      G   ...passed-by-fd --v8-snapshot-passed-by-fd    20MiB |
|    0     24765      G   ...passed-by-fd --v8-snapshot-passed-by-fd    12MiB |
+-----------------------------------------------------------------------------+
INFO train_net.py: 106: Training with config:
INFO train_net.py: 107: {'BBOX_XFORM_CLIP': 4.135166556742356,
 'CLUSTER': {'ON_CLUSTER': False},
 'DATA_LOADER': {'BLOBS_QUEUE_CAPACITY': 8,
                 'MINIBATCH_QUEUE_SIZE': 64,
                 'NUM_THREADS': 4},
 'DEDUP_BOXES': 0.0625,
 'DOWNLOAD_CACHE': '/tmp/detectron-download-cache',
 'EPS': 1e-14,
 'EXPECTED_RESULTS': [],
 'EXPECTED_RESULTS_ATOL': 0.005,
 'EXPECTED_RESULTS_EMAIL': '',
 'EXPECTED_RESULTS_RTOL': 0.1,
 'EXPECTED_RESULTS_SIGMA_TOL': 4,
 'FAST_RCNN': {'CONV_HEAD_DIM': 256,
               'MLP_HEAD_DIM': 1024,
               'NUM_STACKED_CONVS': 4,
               'ROI_BOX_HEAD': 'fast_rcnn_heads.add_roi_2mlp_head',
               'ROI_XFORM_METHOD': 'RoIAlign',
               'ROI_XFORM_RESOLUTION': 7,
               'ROI_XFORM_SAMPLING_RATIO': 2},
 'FPN': {'COARSEST_STRIDE': 32,
         'DIM': 256,
         'EXTRA_CONV_LEVELS': False,
         'FPN_ON': True,
         'MULTILEVEL_ROIS': True,
         'MULTILEVEL_RPN': True,
         'ROI_CANONICAL_LEVEL': 4,
         'ROI_CANONICAL_SCALE': 224,
         'ROI_MAX_LEVEL': 5,
         'ROI_MIN_LEVEL': 2,
         'RPN_ANCHOR_START_SIZE': 32,
         'RPN_ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_MAX_LEVEL': 6,
         'RPN_MIN_LEVEL': 2,
         'USE_GN': False,
         'ZERO_INIT_LATERAL': False},
 'GROUP_NORM': {'DIM_PER_GP': -1, 'EPSILON': 1e-05, 'NUM_GROUPS': 32},
 'KRCNN': {'CONV_HEAD_DIM': 256,
           'CONV_HEAD_KERNEL': 3,
           'CONV_INIT': 'GaussianFill',
           'DECONV_DIM': 256,
           'DECONV_KERNEL': 4,
           'DILATION': 1,
           'HEATMAP_SIZE': -1,
           'INFERENCE_MIN_SIZE': 0,
           'KEYPOINT_CONFIDENCE': 'bbox',
           'LOSS_WEIGHT': 1.0,
           'MIN_KEYPOINT_COUNT_FOR_VALID_MINIBATCH': 20,
           'NMS_OKS': False,
           'NORMALIZE_BY_VISIBLE_KEYPOINTS': True,
           'NUM_KEYPOINTS': -1,
           'NUM_STACKED_CONVS': 8,
           'ROI_KEYPOINTS_HEAD': '',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 7,
           'ROI_XFORM_SAMPLING_RATIO': 0,
           'UP_SCALE': -1,
           'USE_DECONV': False,
           'USE_DECONV_OUTPUT': False},
 'MATLAB': 'matlab',
 'MEMONGER': True,
 'MEMONGER_SHARE_ACTIVATIONS': False,
 'MODEL': {'BBOX_REG_WEIGHTS': (10.0, 10.0, 5.0, 5.0),
           'CLS_AGNOSTIC_BBOX_REG': False,
           'CONV_BODY': 'FPN.add_fpn_ResNet50_conv5_body',
           'EXECUTION_TYPE': 'dag',
           'FASTER_RCNN': True,
           'KEYPOINTS_ON': False,
           'MASK_ON': True,
           'NUM_CLASSES': 2,
           'RPN_ONLY': False,
           'TYPE': 'generalized_rcnn'},
 'MRCNN': {'CLS_SPECIFIC_MASK': True,
           'CONV_INIT': 'MSRAFill',
           'DILATION': 1,
           'DIM_REDUCED': 256,
           'RESOLUTION': 28,
           'ROI_MASK_HEAD': 'mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 14,
           'ROI_XFORM_SAMPLING_RATIO': 2,
           'THRESH_BINARIZE': 0.5,
           'UPSAMPLE_RATIO': 1,
           'USE_FC_OUTPUT': False,
           'WEIGHT_LOSS_MASK': 1.0},
 'NUM_GPUS': 1,
 'OUTPUT_DIR': 'out_dir',
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'RESNETS': {'NUM_GROUPS': 1,
             'RES5_DILATION': 1,
             'SHORTCUT_FUNC': 'basic_bn_shortcut',
             'STEM_FUNC': 'basic_bn_stem',
             'STRIDE_1X1': True,
             'TRANS_FUNC': 'bottleneck_transformation',
             'WIDTH_PER_GROUP': 64},
 'RETINANET': {'ANCHOR_SCALE': 4,
               'ASPECT_RATIOS': (0.5, 1.0, 2.0),
               'BBOX_REG_BETA': 0.11,
               'BBOX_REG_WEIGHT': 1.0,
               'CLASS_SPECIFIC_BBOX': False,
               'INFERENCE_TH': 0.05,
               'LOSS_ALPHA': 0.25,
               'LOSS_GAMMA': 2.0,
               'NEGATIVE_OVERLAP': 0.4,
               'NUM_CONVS': 4,
               'POSITIVE_OVERLAP': 0.5,
               'PRE_NMS_TOP_N': 1000,
               'PRIOR_PROB': 0.01,
               'RETINANET_ON': False,
               'SCALES_PER_OCTAVE': 3,
               'SHARE_CLS_BBOX_TOWER': False,
               'SOFTMAX': False},
 'RFCN': {'PS_GRID_SIZE': 3},
 'RNG_SEED': 3,
 'ROOT_DIR': '/home/gengfeng/Desktop/projects/DETECTRON',
 'RPN': {'ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_ON': True,
         'SIZES': (64, 128, 256, 512),
         'STRIDE': 16},
 'SOLVER': {'BASE_LR': 0.02,
            'GAMMA': 0.1,
            'LOG_LR_CHANGE_THRESHOLD': 1.1,
            'LRS': [],
            'LR_POLICY': 'steps_with_decay',
            'MAX_ITER': 90000,
            'MOMENTUM': 0.9,
            'SCALE_MOMENTUM': True,
            'SCALE_MOMENTUM_THRESHOLD': 1.1,
            'STEPS': [0, 60000, 80000],
            'STEP_SIZE': 30000,
            'WARM_UP_FACTOR': 0.3333333333333333,
            'WARM_UP_ITERS': 500,
            'WARM_UP_METHOD': 'linear',
            'WEIGHT_DECAY': 0.0001,
            'WEIGHT_DECAY_GN': 0.0},
 'TEST': {'BBOX_AUG': {'AREA_TH_HI': 32400,
                       'AREA_TH_LO': 2500,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'COORD_HEUR': 'UNION',
                       'ENABLED': False,
                       'H_FLIP': False,
                       'MAX_SIZE': 4000,
                       'SCALES': (),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False,
                       'SCORE_HEUR': 'UNION'},
          'BBOX_REG': True,
          'BBOX_VOTE': {'ENABLED': False,
                        'SCORING_METHOD': 'ID',
                        'SCORING_METHOD_BETA': 1.0,
                        'VOTE_TH': 0.8},
          'COMPETITION_MODE': True,
          'DATASETS': ('labelme_val',),
          'DETECTIONS_PER_IM': 100,
          'FORCE_JSON_DATASET_EVAL': False,
          'KPS_AUG': {'AREA_TH': 32400,
                      'ASPECT_RATIOS': (),
                      'ASPECT_RATIO_H_FLIP': False,
                      'ENABLED': False,
                      'HEUR': 'HM_AVG',
                      'H_FLIP': False,
                      'MAX_SIZE': 4000,
                      'SCALES': (),
                      'SCALE_H_FLIP': False,
                      'SCALE_SIZE_DEP': False},
          'MASK_AUG': {'AREA_TH': 32400,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'ENABLED': False,
                       'HEUR': 'SOFT_AVG',
                       'H_FLIP': False,
                       'MAX_SIZE': 4000,
                       'SCALES': (),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False},
          'MAX_SIZE': 1333,
          'NMS': 0.5,
          'PRECOMPUTED_PROPOSALS': False,
          'PROPOSAL_FILES': (),
          'PROPOSAL_LIMIT': 2000,
          'RPN_MIN_SIZE': 0,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 1000,
          'RPN_PRE_NMS_TOP_N': 1000,
          'SCALE': 800,
          'SCORE_THRESH': 0.05,
          'SOFT_NMS': {'ENABLED': False, 'METHOD': 'linear', 'SIGMA': 0.5},
          'WEIGHTS': ''},
 'TRAIN': {'ASPECT_GROUPING': True,
           'AUTO_RESUME': True,
           'BATCH_SIZE_PER_IM': 64,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'COPY_WEIGHTS': False,
           'CROWD_FILTER_THRESH': 0.7,
           'DATASETS': ('labelme_train',),
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FREEZE_AT': 2,
           'FREEZE_CONV_BODY': False,
           'GT_MIN_AREA': -1,
           'IMS_PER_BATCH': 2,
           'MAX_SIZE': 1333,
           'PROPOSAL_FILES': (),
           'RPN_BATCH_SIZE_PER_IM': 256,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 0,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 2000,
           'RPN_STRADDLE_THRESH': 0,
           'SCALES': (800,),
           'SNAPSHOT_ITERS': 20000,
           'USE_FLIPPED': True,
           'WEIGHTS': '/tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl'},
 'USE_NCCL': False,
 'VIS': False,
 'VIS_TH': 0.9}
INFO train.py: 144: Building model: generalized_rcnn
WARNING cnn.py:  25: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
WARNING memonger.py:  55: NOTE: Executing memonger to optimize gradient memory
[I memonger.cc:236] Remapping 93 using 21 shared blobs.
INFO memonger.py:  97: Memonger memory optimization took 0.007819175720214844 secs
[I context_gpu.cu:318] GPU 0: 153 MB
[I context_gpu.cu:322] Total: 153 MB
[I context_gpu.cu:318] GPU 0: 320 MB
[I context_gpu.cu:322] Total: 320 MB
INFO train.py: 192: Loading dataset: ('labelme_train',)
loading annotations into memory...
Done (t=0.05s)
creating index...
index created!
INFO roidb.py:  49: Appending horizontally-flipped training examples...
INFO roidb.py:  51: Loaded dataset: labelme_train
INFO roidb.py: 135: Filtered 0 roidb entries: 3404 -> 3404
INFO roidb.py:  67: Computing bounding-box regression targets...
INFO roidb.py:  69: done
INFO train.py: 196: 3404 roidb entries
INFO net.py:  60: Loading weights from: /tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl
INFO net.py:  96: conv1_w loaded from weights file into gpu_0/conv1_w: (64, 3, 7, 7)
INFO net.py:  96: res_conv1_bn_s loaded from weights file into gpu_0/res_conv1_bn_s: (64,)
INFO net.py:  96: res_conv1_bn_b loaded from weights file into gpu_0/res_conv1_bn_b: (64,)
INFO net.py:  96: res2_0_branch2a_w loaded from weights file into gpu_0/res2_0_branch2a_w: (64, 64, 1, 1)
INFO net.py:  96: res2_0_branch2a_bn_s loaded from weights file into gpu_0/res2_0_branch2a_bn_s: (64,)
INFO net.py:  96: res2_0_branch2a_bn_b loaded from weights file into gpu_0/res2_0_branch2a_bn_b: (64,)
INFO net.py:  96: res2_0_branch2b_w loaded from weights file into gpu_0/res2_0_branch2b_w: (64, 64, 3, 3)
INFO net.py:  96: res2_0_branch2b_bn_s loaded from weights file into gpu_0/res2_0_branch2b_bn_s: (64,)
INFO net.py:  96: res2_0_branch2b_bn_b loaded from weights file into gpu_0/res2_0_branch2b_bn_b: (64,)
INFO net.py:  96: res2_0_branch2c_w loaded from weights file into gpu_0/res2_0_branch2c_w: (256, 64, 1, 1)
INFO net.py:  96: res2_0_branch2c_bn_s loaded from weights file into gpu_0/res2_0_branch2c_bn_s: (256,)
INFO net.py:  96: res2_0_branch2c_bn_b loaded from weights file into gpu_0/res2_0_branch2c_bn_b: (256,)
INFO net.py:  96: res2_0_branch1_w loaded from weights file into gpu_0/res2_0_branch1_w: (256, 64, 1, 1)
INFO net.py:  96: res2_0_branch1_bn_s loaded from weights file into gpu_0/res2_0_branch1_bn_s: (256,)
INFO net.py:  96: res2_0_branch1_bn_b loaded from weights file into gpu_0/res2_0_branch1_bn_b: (256,)
INFO net.py:  96: res2_1_branch2a_w loaded from weights file into gpu_0/res2_1_branch2a_w: (64, 256, 1, 1)
INFO net.py:  96: res2_1_branch2a_bn_s loaded from weights file into gpu_0/res2_1_branch2a_bn_s: (64,)
INFO net.py:  96: res2_1_branch2a_bn_b loaded from weights file into gpu_0/res2_1_branch2a_bn_b: (64,)
INFO net.py:  96: res2_1_branch2b_w loaded from weights file into gpu_0/res2_1_branch2b_w: (64, 64, 3, 3)
INFO net.py:  96: res2_1_branch2b_bn_s loaded from weights file into gpu_0/res2_1_branch2b_bn_s: (64,)
INFO net.py:  96: res2_1_branch2b_bn_b loaded from weights file into gpu_0/res2_1_branch2b_bn_b: (64,)
INFO net.py:  96: res2_1_branch2c_w loaded from weights file into gpu_0/res2_1_branch2c_w: (256, 64, 1, 1)
INFO net.py:  96: res2_1_branch2c_bn_s loaded from weights file into gpu_0/res2_1_branch2c_bn_s: (256,)
INFO net.py:  96: res2_1_branch2c_bn_b loaded from weights file into gpu_0/res2_1_branch2c_bn_b: (256,)
INFO net.py:  96: res2_2_branch2a_w loaded from weights file into gpu_0/res2_2_branch2a_w: (64, 256, 1, 1)
INFO net.py:  96: res2_2_branch2a_bn_s loaded from weights file into gpu_0/res2_2_branch2a_bn_s: (64,)
INFO net.py:  96: res2_2_branch2a_bn_b loaded from weights file into gpu_0/res2_2_branch2a_bn_b: (64,)
INFO net.py:  96: res2_2_branch2b_w loaded from weights file into gpu_0/res2_2_branch2b_w: (64, 64, 3, 3)
INFO net.py:  96: res2_2_branch2b_bn_s loaded from weights file into gpu_0/res2_2_branch2b_bn_s: (64,)
INFO net.py:  96: res2_2_branch2b_bn_b loaded from weights file into gpu_0/res2_2_branch2b_bn_b: (64,)
INFO net.py:  96: res2_2_branch2c_w loaded from weights file into gpu_0/res2_2_branch2c_w: (256, 64, 1, 1)
INFO net.py:  96: res2_2_branch2c_bn_s loaded from weights file into gpu_0/res2_2_branch2c_bn_s: (256,)
INFO net.py:  96: res2_2_branch2c_bn_b loaded from weights file into gpu_0/res2_2_branch2c_bn_b: (256,)
INFO net.py:  96: res3_0_branch2a_w loaded from weights file into gpu_0/res3_0_branch2a_w: (128, 256, 1, 1)
INFO net.py:  96: res3_0_branch2a_bn_s loaded from weights file into gpu_0/res3_0_branch2a_bn_s: (128,)
INFO net.py:  96: res3_0_branch2a_bn_b loaded from weights file into gpu_0/res3_0_branch2a_bn_b: (128,)
INFO net.py:  96: res3_0_branch2b_w loaded from weights file into gpu_0/res3_0_branch2b_w: (128, 128, 3, 3)
INFO net.py:  96: res3_0_branch2b_bn_s loaded from weights file into gpu_0/res3_0_branch2b_bn_s: (128,)
INFO net.py:  96: res3_0_branch2b_bn_b loaded from weights file into gpu_0/res3_0_branch2b_bn_b: (128,)
INFO net.py:  96: res3_0_branch2c_w loaded from weights file into gpu_0/res3_0_branch2c_w: (512, 128, 1, 1)
INFO net.py:  96: res3_0_branch2c_bn_s loaded from weights file into gpu_0/res3_0_branch2c_bn_s: (512,)
INFO net.py:  96: res3_0_branch2c_bn_b loaded from weights file into gpu_0/res3_0_branch2c_bn_b: (512,)
INFO net.py:  96: res3_0_branch1_w loaded from weights file into gpu_0/res3_0_branch1_w: (512, 256, 1, 1)
INFO net.py:  96: res3_0_branch1_bn_s loaded from weights file into gpu_0/res3_0_branch1_bn_s: (512,)
INFO net.py:  96: res3_0_branch1_bn_b loaded from weights file into gpu_0/res3_0_branch1_bn_b: (512,)
INFO net.py:  96: res3_1_branch2a_w loaded from weights file into gpu_0/res3_1_branch2a_w: (128, 512, 1, 1)
INFO net.py:  96: res3_1_branch2a_bn_s loaded from weights file into gpu_0/res3_1_branch2a_bn_s: (128,)
INFO net.py:  96: res3_1_branch2a_bn_b loaded from weights file into gpu_0/res3_1_branch2a_bn_b: (128,)
INFO net.py:  96: res3_1_branch2b_w loaded from weights file into gpu_0/res3_1_branch2b_w: (128, 128, 3, 3)
INFO net.py:  96: res3_1_branch2b_bn_s loaded from weights file into gpu_0/res3_1_branch2b_bn_s: (128,)
INFO net.py:  96: res3_1_branch2b_bn_b loaded from weights file into gpu_0/res3_1_branch2b_bn_b: (128,)
INFO net.py:  96: res3_1_branch2c_w loaded from weights file into gpu_0/res3_1_branch2c_w: (512, 128, 1, 1)
INFO net.py:  96: res3_1_branch2c_bn_s loaded from weights file into gpu_0/res3_1_branch2c_bn_s: (512,)
INFO net.py:  96: res3_1_branch2c_bn_b loaded from weights file into gpu_0/res3_1_branch2c_bn_b: (512,)
INFO net.py:  96: res3_2_branch2a_w loaded from weights file into gpu_0/res3_2_branch2a_w: (128, 512, 1, 1)
INFO net.py:  96: res3_2_branch2a_bn_s loaded from weights file into gpu_0/res3_2_branch2a_bn_s: (128,)
INFO net.py:  96: res3_2_branch2a_bn_b loaded from weights file into gpu_0/res3_2_branch2a_bn_b: (128,)
INFO net.py:  96: res3_2_branch2b_w loaded from weights file into gpu_0/res3_2_branch2b_w: (128, 128, 3, 3)
INFO net.py:  96: res3_2_branch2b_bn_s loaded from weights file into gpu_0/res3_2_branch2b_bn_s: (128,)
INFO net.py:  96: res3_2_branch2b_bn_b loaded from weights file into gpu_0/res3_2_branch2b_bn_b: (128,)
INFO net.py:  96: res3_2_branch2c_w loaded from weights file into gpu_0/res3_2_branch2c_w: (512, 128, 1, 1)
INFO net.py:  96: res3_2_branch2c_bn_s loaded from weights file into gpu_0/res3_2_branch2c_bn_s: (512,)
INFO net.py:  96: res3_2_branch2c_bn_b loaded from weights file into gpu_0/res3_2_branch2c_bn_b: (512,)
INFO net.py:  96: res3_3_branch2a_w loaded from weights file into gpu_0/res3_3_branch2a_w: (128, 512, 1, 1)
INFO net.py:  96: res3_3_branch2a_bn_s loaded from weights file into gpu_0/res3_3_branch2a_bn_s: (128,)
INFO net.py:  96: res3_3_branch2a_bn_b loaded from weights file into gpu_0/res3_3_branch2a_bn_b: (128,)
INFO net.py:  96: res3_3_branch2b_w loaded from weights file into gpu_0/res3_3_branch2b_w: (128, 128, 3, 3)
INFO net.py:  96: res3_3_branch2b_bn_s loaded from weights file into gpu_0/res3_3_branch2b_bn_s: (128,)
INFO net.py:  96: res3_3_branch2b_bn_b loaded from weights file into gpu_0/res3_3_branch2b_bn_b: (128,)
INFO net.py:  96: res3_3_branch2c_w loaded from weights file into gpu_0/res3_3_branch2c_w: (512, 128, 1, 1)
INFO net.py:  96: res3_3_branch2c_bn_s loaded from weights file into gpu_0/res3_3_branch2c_bn_s: (512,)
INFO net.py:  96: res3_3_branch2c_bn_b loaded from weights file into gpu_0/res3_3_branch2c_bn_b: (512,)
INFO net.py:  96: res4_0_branch2a_w loaded from weights file into gpu_0/res4_0_branch2a_w: (256, 512, 1, 1)
INFO net.py:  96: res4_0_branch2a_bn_s loaded from weights file into gpu_0/res4_0_branch2a_bn_s: (256,)
INFO net.py:  96: res4_0_branch2a_bn_b loaded from weights file into gpu_0/res4_0_branch2a_bn_b: (256,)
INFO net.py:  96: res4_0_branch2b_w loaded from weights file into gpu_0/res4_0_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_0_branch2b_bn_s loaded from weights file into gpu_0/res4_0_branch2b_bn_s: (256,)
INFO net.py:  96: res4_0_branch2b_bn_b loaded from weights file into gpu_0/res4_0_branch2b_bn_b: (256,)
INFO net.py:  96: res4_0_branch2c_w loaded from weights file into gpu_0/res4_0_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_0_branch2c_bn_s loaded from weights file into gpu_0/res4_0_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_0_branch2c_bn_b loaded from weights file into gpu_0/res4_0_branch2c_bn_b: (1024,)
INFO net.py:  96: res4_0_branch1_w loaded from weights file into gpu_0/res4_0_branch1_w: (1024, 512, 1, 1)
INFO net.py:  96: res4_0_branch1_bn_s loaded from weights file into gpu_0/res4_0_branch1_bn_s: (1024,)
INFO net.py:  96: res4_0_branch1_bn_b loaded from weights file into gpu_0/res4_0_branch1_bn_b: (1024,)
INFO net.py:  96: res4_1_branch2a_w loaded from weights file into gpu_0/res4_1_branch2a_w: (256, 1024, 1, 1)
INFO net.py:  96: res4_1_branch2a_bn_s loaded from weights file into gpu_0/res4_1_branch2a_bn_s: (256,)
INFO net.py:  96: res4_1_branch2a_bn_b loaded from weights file into gpu_0/res4_1_branch2a_bn_b: (256,)
INFO net.py:  96: res4_1_branch2b_w loaded from weights file into gpu_0/res4_1_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_1_branch2b_bn_s loaded from weights file into gpu_0/res4_1_branch2b_bn_s: (256,)
INFO net.py:  96: res4_1_branch2b_bn_b loaded from weights file into gpu_0/res4_1_branch2b_bn_b: (256,)
INFO net.py:  96: res4_1_branch2c_w loaded from weights file into gpu_0/res4_1_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_1_branch2c_bn_s loaded from weights file into gpu_0/res4_1_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_1_branch2c_bn_b loaded from weights file into gpu_0/res4_1_branch2c_bn_b: (1024,)
INFO net.py:  96: res4_2_branch2a_w loaded from weights file into gpu_0/res4_2_branch2a_w: (256, 1024, 1, 1)
INFO net.py:  96: res4_2_branch2a_bn_s loaded from weights file into gpu_0/res4_2_branch2a_bn_s: (256,)
INFO net.py:  96: res4_2_branch2a_bn_b loaded from weights file into gpu_0/res4_2_branch2a_bn_b: (256,)
INFO net.py:  96: res4_2_branch2b_w loaded from weights file into gpu_0/res4_2_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_2_branch2b_bn_s loaded from weights file into gpu_0/res4_2_branch2b_bn_s: (256,)
INFO net.py:  96: res4_2_branch2b_bn_b loaded from weights file into gpu_0/res4_2_branch2b_bn_b: (256,)
INFO net.py:  96: res4_2_branch2c_w loaded from weights file into gpu_0/res4_2_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_2_branch2c_bn_s loaded from weights file into gpu_0/res4_2_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_2_branch2c_bn_b loaded from weights file into gpu_0/res4_2_branch2c_bn_b: (1024,)
INFO net.py:  96: res4_3_branch2a_w loaded from weights file into gpu_0/res4_3_branch2a_w: (256, 1024, 1, 1)
INFO net.py:  96: res4_3_branch2a_bn_s loaded from weights file into gpu_0/res4_3_branch2a_bn_s: (256,)
INFO net.py:  96: res4_3_branch2a_bn_b loaded from weights file into gpu_0/res4_3_branch2a_bn_b: (256,)
INFO net.py:  96: res4_3_branch2b_w loaded from weights file into gpu_0/res4_3_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_3_branch2b_bn_s loaded from weights file into gpu_0/res4_3_branch2b_bn_s: (256,)
INFO net.py:  96: res4_3_branch2b_bn_b loaded from weights file into gpu_0/res4_3_branch2b_bn_b: (256,)
INFO net.py:  96: res4_3_branch2c_w loaded from weights file into gpu_0/res4_3_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_3_branch2c_bn_s loaded from weights file into gpu_0/res4_3_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_3_branch2c_bn_b loaded from weights file into gpu_0/res4_3_branch2c_bn_b: (1024,)
INFO net.py:  96: res4_4_branch2a_w loaded from weights file into gpu_0/res4_4_branch2a_w: (256, 1024, 1, 1)
INFO net.py:  96: res4_4_branch2a_bn_s loaded from weights file into gpu_0/res4_4_branch2a_bn_s: (256,)
INFO net.py:  96: res4_4_branch2a_bn_b loaded from weights file into gpu_0/res4_4_branch2a_bn_b: (256,)
INFO net.py:  96: res4_4_branch2b_w loaded from weights file into gpu_0/res4_4_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_4_branch2b_bn_s loaded from weights file into gpu_0/res4_4_branch2b_bn_s: (256,)
INFO net.py:  96: res4_4_branch2b_bn_b loaded from weights file into gpu_0/res4_4_branch2b_bn_b: (256,)
INFO net.py:  96: res4_4_branch2c_w loaded from weights file into gpu_0/res4_4_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_4_branch2c_bn_s loaded from weights file into gpu_0/res4_4_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_4_branch2c_bn_b loaded from weights file into gpu_0/res4_4_branch2c_bn_b: (1024,)
INFO net.py:  96: res4_5_branch2a_w loaded from weights file into gpu_0/res4_5_branch2a_w: (256, 1024, 1, 1)
INFO net.py:  96: res4_5_branch2a_bn_s loaded from weights file into gpu_0/res4_5_branch2a_bn_s: (256,)
INFO net.py:  96: res4_5_branch2a_bn_b loaded from weights file into gpu_0/res4_5_branch2a_bn_b: (256,)
INFO net.py:  96: res4_5_branch2b_w loaded from weights file into gpu_0/res4_5_branch2b_w: (256, 256, 3, 3)
INFO net.py:  96: res4_5_branch2b_bn_s loaded from weights file into gpu_0/res4_5_branch2b_bn_s: (256,)
INFO net.py:  96: res4_5_branch2b_bn_b loaded from weights file into gpu_0/res4_5_branch2b_bn_b: (256,)
INFO net.py:  96: res4_5_branch2c_w loaded from weights file into gpu_0/res4_5_branch2c_w: (1024, 256, 1, 1)
INFO net.py:  96: res4_5_branch2c_bn_s loaded from weights file into gpu_0/res4_5_branch2c_bn_s: (1024,)
INFO net.py:  96: res4_5_branch2c_bn_b loaded from weights file into gpu_0/res4_5_branch2c_bn_b: (1024,)
INFO net.py:  96: res5_0_branch2a_w loaded from weights file into gpu_0/res5_0_branch2a_w: (512, 1024, 1, 1)
INFO net.py:  96: res5_0_branch2a_bn_s loaded from weights file into gpu_0/res5_0_branch2a_bn_s: (512,)
INFO net.py:  96: res5_0_branch2a_bn_b loaded from weights file into gpu_0/res5_0_branch2a_bn_b: (512,)
INFO net.py:  96: res5_0_branch2b_w loaded from weights file into gpu_0/res5_0_branch2b_w: (512, 512, 3, 3)
INFO net.py:  96: res5_0_branch2b_bn_s loaded from weights file into gpu_0/res5_0_branch2b_bn_s: (512,)
INFO net.py:  96: res5_0_branch2b_bn_b loaded from weights file into gpu_0/res5_0_branch2b_bn_b: (512,)
INFO net.py:  96: res5_0_branch2c_w loaded from weights file into gpu_0/res5_0_branch2c_w: (2048, 512, 1, 1)
INFO net.py:  96: res5_0_branch2c_bn_s loaded from weights file into gpu_0/res5_0_branch2c_bn_s: (2048,)
INFO net.py:  96: res5_0_branch2c_bn_b loaded from weights file into gpu_0/res5_0_branch2c_bn_b: (2048,)
INFO net.py:  96: res5_0_branch1_w loaded from weights file into gpu_0/res5_0_branch1_w: (2048, 1024, 1, 1)
INFO net.py:  96: res5_0_branch1_bn_s loaded from weights file into gpu_0/res5_0_branch1_bn_s: (2048,)
INFO net.py:  96: res5_0_branch1_bn_b loaded from weights file into gpu_0/res5_0_branch1_bn_b: (2048,)
INFO net.py:  96: res5_1_branch2a_w loaded from weights file into gpu_0/res5_1_branch2a_w: (512, 2048, 1, 1)
INFO net.py:  96: res5_1_branch2a_bn_s loaded from weights file into gpu_0/res5_1_branch2a_bn_s: (512,)
INFO net.py:  96: res5_1_branch2a_bn_b loaded from weights file into gpu_0/res5_1_branch2a_bn_b: (512,)
INFO net.py:  96: res5_1_branch2b_w loaded from weights file into gpu_0/res5_1_branch2b_w: (512, 512, 3, 3)
INFO net.py:  96: res5_1_branch2b_bn_s loaded from weights file into gpu_0/res5_1_branch2b_bn_s: (512,)
INFO net.py:  96: res5_1_branch2b_bn_b loaded from weights file into gpu_0/res5_1_branch2b_bn_b: (512,)
INFO net.py:  96: res5_1_branch2c_w loaded from weights file into gpu_0/res5_1_branch2c_w: (2048, 512, 1, 1)
INFO net.py:  96: res5_1_branch2c_bn_s loaded from weights file into gpu_0/res5_1_branch2c_bn_s: (2048,)
INFO net.py:  96: res5_1_branch2c_bn_b loaded from weights file into gpu_0/res5_1_branch2c_bn_b: (2048,)
INFO net.py:  96: res5_2_branch2a_w loaded from weights file into gpu_0/res5_2_branch2a_w: (512, 2048, 1, 1)
INFO net.py:  96: res5_2_branch2a_bn_s loaded from weights file into gpu_0/res5_2_branch2a_bn_s: (512,)
INFO net.py:  96: res5_2_branch2a_bn_b loaded from weights file into gpu_0/res5_2_branch2a_bn_b: (512,)
INFO net.py:  96: res5_2_branch2b_w loaded from weights file into gpu_0/res5_2_branch2b_w: (512, 512, 3, 3)
INFO net.py:  96: res5_2_branch2b_bn_s loaded from weights file into gpu_0/res5_2_branch2b_bn_s: (512,)
INFO net.py:  96: res5_2_branch2b_bn_b loaded from weights file into gpu_0/res5_2_branch2b_bn_b: (512,)
INFO net.py:  96: res5_2_branch2c_w loaded from weights file into gpu_0/res5_2_branch2c_w: (2048, 512, 1, 1)
INFO net.py:  96: res5_2_branch2c_bn_s loaded from weights file into gpu_0/res5_2_branch2c_bn_s: (2048,)
INFO net.py:  96: res5_2_branch2c_bn_b loaded from weights file into gpu_0/res5_2_branch2c_bn_b: (2048,)
INFO net.py:  89: fpn_inner_res5_2_sum_w not found
INFO net.py:  89: fpn_inner_res5_2_sum_b not found
INFO net.py:  89: fpn_inner_res4_5_sum_lateral_w not found
INFO net.py:  89: fpn_inner_res4_5_sum_lateral_b not found
INFO net.py:  89: fpn_inner_res3_3_sum_lateral_w not found
INFO net.py:  89: fpn_inner_res3_3_sum_lateral_b not found
INFO net.py:  89: fpn_inner_res2_2_sum_lateral_w not found
INFO net.py:  89: fpn_inner_res2_2_sum_lateral_b not found
INFO net.py:  89: fpn_res5_2_sum_w not found
INFO net.py:  89: fpn_res5_2_sum_b not found
INFO net.py:  89: fpn_res4_5_sum_w not found
INFO net.py:  89: fpn_res4_5_sum_b not found
INFO net.py:  89: fpn_res3_3_sum_w not found
INFO net.py:  89: fpn_res3_3_sum_b not found
INFO net.py:  89: fpn_res2_2_sum_w not found
INFO net.py:  89: fpn_res2_2_sum_b not found
INFO net.py:  89: conv_rpn_fpn2_w not found
INFO net.py:  89: conv_rpn_fpn2_b not found
INFO net.py:  89: rpn_cls_logits_fpn2_w not found
INFO net.py:  89: rpn_cls_logits_fpn2_b not found
INFO net.py:  89: rpn_bbox_pred_fpn2_w not found
INFO net.py:  89: rpn_bbox_pred_fpn2_b not found
INFO net.py:  89: fc6_w not found
INFO net.py:  89: fc6_b not found
INFO net.py:  89: fc7_w not found
INFO net.py:  89: fc7_b not found
INFO net.py:  89: cls_score_w not found
INFO net.py:  89: cls_score_b not found
INFO net.py:  89: bbox_pred_w not found
INFO net.py:  89: bbox_pred_b not found
INFO net.py:  89: fcn1_w not found
INFO net.py:  89: fcn1_b not found
INFO net.py:  89: fcn2_w not found
INFO net.py:  89: fcn2_b not found
INFO net.py:  89: fcn3_w not found
INFO net.py:  89: fcn3_b not found
INFO net.py:  89: fcn4_w not found
INFO net.py:  89: fcn4_b not found
INFO net.py:  89: conv5_mask_w not found
INFO net.py:  89: conv5_mask_b not found
INFO net.py:  89: mask_fcn_logits_w not found
INFO net.py:  89: mask_fcn_logits_b not found
INFO net.py: 133: res2_1_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res3_1_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_2_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_2_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res2_2_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res3_3_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_1_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res3_3_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_4_branch2b_b preserved in workspace (unused)
INFO net.py: 133: conv1_b preserved in workspace (unused)
INFO net.py: 133: fc1000_b preserved in workspace (unused)
INFO net.py: 133: fc1000_w preserved in workspace (unused)
INFO net.py: 133: res3_2_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res3_2_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res4_3_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res2_0_branch1_b preserved in workspace (unused)
INFO net.py: 133: res5_0_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_5_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res4_0_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res2_0_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res2_1_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res4_0_branch1_b preserved in workspace (unused)
INFO net.py: 133: res2_2_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_3_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res3_2_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_5_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res3_0_branch1_b preserved in workspace (unused)
INFO net.py: 133: res2_0_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_1_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_0_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res4_1_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res4_0_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res5_2_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_5_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_2_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res2_1_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res3_1_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res3_0_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res2_2_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res3_1_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_1_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_1_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_4_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_2_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res2_0_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res3_3_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_1_branch2b_b preserved in workspace (unused)
INFO net.py: 133: res4_4_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res5_0_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res5_2_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_0_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res3_0_branch2a_b preserved in workspace (unused)
INFO net.py: 133: res5_0_branch1_b preserved in workspace (unused)
INFO net.py: 133: res3_0_branch2c_b preserved in workspace (unused)
INFO net.py: 133: res4_3_branch2c_b preserved in workspace (unused)
[I net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 0.000174861 secs
INFO train.py: 180: Outputs saved to: /home/gengfeng/Desktop/projects/DETECTRON/out_dir/train/labelme_train/generalized_rcnn
INFO loader.py: 230: Pre-filling mini-batch queue...
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [1/64]
[I context_gpu.cu:318] GPU 0: 449 MB
[I context_gpu.cu:322] Total: 449 MB
INFO loader.py: 235:   [1/64]
[I context_gpu.cu:318] GPU 0: 586 MB
[I context_gpu.cu:322] Total: 586 MB
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [0/64]
[I context_gpu.cu:318] GPU 0: 723 MB
[I context_gpu.cu:322] Total: 723 MB
INFO loader.py: 235:   [0/64]
[I context_gpu.cu:318] GPU 0: 859 MB
[I context_gpu.cu:322] Total: 859 MB
INFO loader.py: 235:   [0/64]
INFO loader.py: 235:   [3/64]
INFO loader.py: 235:   [4/64]
INFO loader.py: 235:   [4/64]
INFO loader.py: 235:   [6/64]
INFO loader.py: 235:   [8/64]
INFO loader.py: 235:   [9/64]
INFO loader.py: 235:   [10/64]
INFO loader.py: 235:   [13/64]
INFO loader.py: 235:   [13/64]
INFO loader.py: 235:   [14/64]
INFO loader.py: 235:   [17/64]
INFO loader.py: 235:   [18/64]
INFO loader.py: 235:   [20/64]
INFO loader.py: 235:   [22/64]
INFO loader.py: 235:   [24/64]
INFO loader.py: 235:   [26/64]
INFO loader.py: 235:   [27/64]
INFO loader.py: 235:   [28/64]
INFO loader.py: 235:   [30/64]
INFO loader.py: 235:   [31/64]
INFO loader.py: 235:   [32/64]
INFO loader.py: 235:   [34/64]
INFO loader.py: 235:   [36/64]
INFO loader.py: 235:   [37/64]
INFO loader.py: 235:   [38/64]
INFO loader.py: 235:   [40/64]
INFO loader.py: 235:   [42/64]
INFO loader.py: 235:   [43/64]
INFO loader.py: 235:   [45/64]
INFO loader.py: 235:   [48/64]
INFO loader.py: 235:   [48/64]
INFO loader.py: 235:   [49/64]
INFO loader.py: 235:   [52/64]
INFO loader.py: 235:   [52/64]
INFO loader.py: 235:   [55/64]
INFO loader.py: 235:   [56/64]
INFO loader.py: 235:   [57/64]
INFO loader.py: 235:   [59/64]
INFO loader.py: 235:   [60/64]
INFO loader.py: 235:   [61/64]
INFO loader.py: 235:   [63/64]
INFO detector.py: 479: Changing learning rate 0.000000 -> 0.006667 at iter 0
[I net_async_base.h:198] Using specified CPU pool size: 4; NUMA node id: -1
[I net_async_base.h:203] Created new CPU pool, size: 4; NUMA node id: -1
[I context_gpu.cu:318] GPU 0: 1075 MB
[I context_gpu.cu:322] Total: 1075 MB
[I context_gpu.cu:318] GPU 0: 1324 MB
[I context_gpu.cu:322] Total: 1324 MB
[I context_gpu.cu:318] GPU 0: 1576 MB
[I context_gpu.cu:322] Total: 1576 MB
[I context_gpu.cu:318] GPU 0: 1712 MB
[I context_gpu.cu:322] Total: 1712 MB
[I context_gpu.cu:318] GPU 0: 1901 MB
[I context_gpu.cu:322] Total: 1901 MB
[I context_gpu.cu:318] GPU 0: 2058 MB
[I context_gpu.cu:322] Total: 2058 MB
[I context_gpu.cu:318] GPU 0: 2216 MB
[I context_gpu.cu:322] Total: 2216 MB
[I context_gpu.cu:318] GPU 0: 2468 MB
[I context_gpu.cu:322] Total: 2468 MB
[I context_gpu.cu:318] GPU 0: 2610 MB
[I context_gpu.cu:322] Total: 2610 MB
[I context_gpu.cu:318] GPU 0: 2746 MB
[I context_gpu.cu:322] Total: 2746 MB
[I context_gpu.cu:318] GPU 0: 2888 MB
[I context_gpu.cu:322] Total: 2888 MB
[I context_gpu.cu:318] GPU 0: 3045 MB
[I context_gpu.cu:322] Total: 3045 MB
[I context_gpu.cu:318] GPU 0: 3203 MB
[I context_gpu.cu:322] Total: 3203 MB
[I context_gpu.cu:318] GPU 0: 3360 MB
[I context_gpu.cu:322] Total: 3360 MB
[I context_gpu.cu:318] GPU 0: 3505 MB
[I context_gpu.cu:322] Total: 3505 MB
[I context_gpu.cu:318] GPU 0: 3636 MB
[I context_gpu.cu:322] Total: 3636 MB
[I context_gpu.cu:318] GPU 0: 3778 MB
[I context_gpu.cu:322] Total: 3778 MB
[I context_gpu.cu:318] GPU 0: 3935 MB
[I context_gpu.cu:322] Total: 3935 MB
[I context_gpu.cu:318] GPU 0: 4065 MB
[I context_gpu.cu:322] Total: 4065 MB
[I context_gpu.cu:318] GPU 0: 4199 MB
[I context_gpu.cu:322] Total: 4199 MB
[I context_gpu.cu:318] GPU 0: 4333 MB
[I context_gpu.cu:322] Total: 4333 MB
[I context_gpu.cu:318] GPU 0: 4462 MB
[I context_gpu.cu:322] Total: 4462 MB
[I context_gpu.cu:318] GPU 0: 4622 MB
[I context_gpu.cu:322] Total: 4622 MB
[I context_gpu.cu:318] GPU 0: 4874 MB
[I context_gpu.cu:322] Total: 4874 MB
[I context_gpu.cu:318] GPU 0: 5108 MB
[I context_gpu.cu:322] Total: 5108 MB
[I context_gpu.cu:318] GPU 0: 5292 MB
[I context_gpu.cu:322] Total: 5292 MB
[I context_gpu.cu:318] GPU 0: 5451 MB
[I context_gpu.cu:322] Total: 5451 MB
[I context_gpu.cu:318] GPU 0: 5688 MB
[I context_gpu.cu:322] Total: 5688 MB
[I context_gpu.cu:318] GPU 0: 5920 MB
[I context_gpu.cu:322] Total: 5920 MB
[I context_gpu.cu:318] GPU 0: 6096 MB
[I context_gpu.cu:322] Total: 6096 MB
[I context_gpu.cu:318] GPU 0: 6228 MB
[I context_gpu.cu:322] Total: 6228 MB
[I context_gpu.cu:318] GPU 0: 6363 MB
[I context_gpu.cu:322] Total: 6363 MB
[E net_async_base.cc:422] [enforce fail at context_gpu.cu:343] error == cudaSuccess. 2 vs 0. Error at: /opt/conda/conda-bld/pytorch-nightly_1539602533843/work/caffe2/core/context_gpu.cu:343: out of memoryError from operator: 
input: "gpu_0/__m16_shared" input: "gpu_0/res4_5_branch2c_bn_s" output: "gpu_0/__m6_shared" name: "" type: "AffineChannelGradient" device_option { device_type: 1 device_id: 0 } is_gradient_op: true,  op AffineChannelGradient
WARNING workspace.py: 187: Original python traceback for operator `422` in network `generalized_rcnn` in exception above (most recent call last):
Traceback (most recent call last):
  File "tools/train_net.py", line 132, in <module>
    main()
  File "tools/train_net.py", line 114, in main
    checkpoints = detectron.utils.train.train_model()
  File "/home/gengfeng/Desktop/projects/DETECTRON/detectron/utils/train.py", line 67, in train_model
    workspace.RunNet(model.net.Proto().name)
  File "/home/gengfeng/anaconda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 219, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/home/gengfeng/anaconda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 180, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at context_gpu.cu:343] error == cudaSuccess. 2 vs 0. Error at: /opt/conda/conda-bld/pytorch-nightly_1539602533843/work/caffe2/core/context_gpu.cu:343: out of memoryError from operator: 
input: "gpu_0/__m16_shared" input: "gpu_0/res4_5_branch2c_bn_s" output: "gpu_0/__m6_shared" name: "" type: "AffineChannelGradient" device_option { device_type: 1 device_id: 0 } is_gradient_op: true

My GPU is a GTX 1080 with 8 GB of memory. If the error is caused by the GPU capacity, where should I decrease the batch_size from 64 to 32? (I have tried modifying it in the *.yaml and loader.py files, but it had no effect.) Or is the batch_size fixed at 64? Waiting for your response. Thanks in advance.
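A small sketch for reading the memory lines in a log like the one above. It assumes the `GPU <id>: <n> MB` lines printed from context_gpu.cu report Caffe2's own tracked allocations, so memory held by other processes would not be included. The function name `peak_gpu_memory` is made up for this example. Note also that the `[n/64]` counter is printed right after "Pre-filling mini-batch queue...", so it most likely reports how full that queue is rather than the batch size.

```python
import re

# Matches memory-tracking lines such as "[I context_gpu.cu:318] GPU 0: 6363 MB".
# The "Total: ... MB" lines do not match because they lack the "GPU <id>:" part.
GPU_LINE = re.compile(r"GPU (\d+): (\d+) MB")


def peak_gpu_memory(log_text):
    """Return {gpu_id: highest MB value reported} for the given log text."""
    peaks = {}
    for gpu_id, mb in GPU_LINE.findall(log_text):
        gpu_id, mb = int(gpu_id), int(mb)
        peaks[gpu_id] = max(mb, peaks.get(gpu_id, 0))
    return peaks


# Last four memory lines from the log above.
sample = """\
[I context_gpu.cu:318] GPU 0: 6228 MB
[I context_gpu.cu:322] Total: 6228 MB
[I context_gpu.cu:318] GPU 0: 6363 MB
[I context_gpu.cu:322] Total: 6363 MB
"""
print(peak_gpu_memory(sample))
```

In the log above the last value reported before the failure is 6363 MB, which is close to what an 8 GB card can offer once the display and the CUDA context have taken their share.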

StephenLau007 commented 6 years ago

I also hit this problem. I am running on 8 GeForce GTX 1080 Ti GPUs. Although I configured 8 GPUs in the yml, infer_simple.py only runs on 4 GPUs, and then the out-of-memory error occurs.

gf19880710 commented 6 years ago

@StephenLU0422, I modified some parameters in the .yaml config file and the issue disappeared. I only changed the SCALES parameter under the TRAIN and TEST sections from 800 to 700. Hope this helps.

StephenLau007 commented 6 years ago

@gf19880710 Thank you! I found the problem. My GPUs 0-3 were also running other jobs, so there was not enough GPU memory left to run infer_simple.py. I just changed which GPUs are used:

sudo NV_GPU=4,5,6,7 nvidia-docker run --rm -it --network=host detectron:c2-cuda9-cudnn7 bash
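Outside Docker, the same idea (avoid GPUs that are already busy) can be sketched in Python. This is only an illustration: the helper names `parse_free_memory`, `pick_gpus` and `select_visible_gpus` are hypothetical, and the sketch assumes `nvidia-smi` is on the PATH and that `CUDA_VISIBLE_DEVICES` is set before the framework initialises CUDA.

```python
import os
import subprocess

# nvidia-smi query that prints one "index, free_MiB" line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"]


def parse_free_memory(csv_text):
    """Parse 'index, free_MiB' lines into a list of (index, free_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, free_mib = (field.strip() for field in line.split(","))
        rows.append((int(index), int(free_mib)))
    return rows


def pick_gpus(rows, count, min_free_mib):
    """Return up to `count` GPU indices with enough free memory, most free first."""
    eligible = [row for row in rows if row[1] >= min_free_mib]
    eligible.sort(key=lambda row: row[1], reverse=True)
    return [index for index, _ in eligible[:count]]


def select_visible_gpus(count, min_free_mib):
    """Query nvidia-smi and export CUDA_VISIBLE_DEVICES for this process."""
    output = subprocess.check_output(QUERY).decode("utf-8")
    chosen = pick_gpus(parse_free_memory(output), count, min_free_mib)
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in chosen)
    return chosen
```

Calling something like `select_visible_gpus(4, 8000)` at the very top of a script would then play the same role as the `os.environ['CUDA_VISIBLE_DEVICES'] = "2,3"` line from the first post, with the indices chosen from whatever is currently free.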

jungaria commented 5 years ago

@gf19880710 Your comments worked.

When I ran infer_simple.py with the options --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir ./test/imageDetection --image-ext jpg --wts trainedWeights/model_final.pkl demo, I got an "out of memory" error.

I changed the SCALES/SCALE values in e2e_mask_rcnn_R-101-FPN_2x.yaml from 800 to 300 (700 actually produced the same error).

I am going to check the meaning of these values (SCALES/SCALE).

Thanks
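A rough way to see why lowering these values helps, assuming SCALES/SCALE set the target length in pixels of the image's shorter side and the aspect ratio is kept: the number of pixels, and with it roughly the size of the feature maps, shrinks with the square of the scale. The function name `relative_pixel_count` is invented for this sketch, and the real memory saving will differ because weights and fixed overheads do not shrink.

```python
def relative_pixel_count(new_scale, old_scale):
    """Ratio of pixel counts when the shorter side goes from old_scale to new_scale.

    Assumes the aspect ratio is preserved, so both sides scale by the same factor.
    """
    return (float(new_scale) / old_scale) ** 2


# The two reductions tried in this thread, both starting from 800.
for new_scale in (700, 300):
    ratio = relative_pixel_count(new_scale, 800)
    print("800 -> {}: about {:.0%} of the pixels".format(new_scale, ratio))
```

Under these assumptions, going from 800 to 700 only removes about a quarter of the pixels, which fits the report that 700 was not always enough while 300 was.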

minan19605 commented 5 years ago

This also works for me. I only have one RTX 2070. I am using Ubuntu 18.04 LTS, CUDA 10.1, and cuDNN 7. I use this command: python tools/train_net.py --cfg configs/04_2018_gn_baselines/e2e_mask_rcnn_R-101-FPN_2x_gn_1gpu.yaml OUTPUT_DIR ../coco/detectron-output

I changed SCALES from 800 to 300 and BATCH_SIZE_PER_IM from 512 to 256. Note: the original yaml file is for multiple GPUs. I modified it to set the number of GPUs to 1 and renamed the file.
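Instead of editing the yaml, the same values could probably be passed as trailing KEY VALUE pairs, the way `OUTPUT_DIR` and `NUM_GPUS` are passed in the commands in this thread. The sketch below only builds the argument list. It assumes the keys are `TRAIN.SCALES` and `TRAIN.BATCH_SIZE_PER_IM` and that a tuple is written as `(300,)`; `build_train_command` is a made-up helper.

```python
def build_train_command(cfg_path, overrides):
    """Return an argv list: train_net.py, --cfg <file>, then KEY VALUE pairs."""
    argv = ["python", "tools/train_net.py", "--cfg", cfg_path]
    for key, value in overrides:
        argv.extend([key, str(value)])
    return argv


command = build_train_command(
    "configs/04_2018_gn_baselines/e2e_mask_rcnn_R-101-FPN_2x_gn_1gpu.yaml",
    [
        ("OUTPUT_DIR", "../coco/detectron-output"),
        ("TRAIN.SCALES", "(300,)"),         # assumed key; was 800 in the yaml
        ("TRAIN.BATCH_SIZE_PER_IM", 256),   # assumed key; was 512 in the yaml
    ],
)

# Passing the list to subprocess avoids shell quoting problems with "(300,)":
# import subprocess; subprocess.call(command)
```

Keeping the overrides on the command line leaves the original yaml untouched, which makes it easier to try several values while tuning.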

minan19605 commented 5 years ago

One more comment: after running train.py for about 3-4 hours, the memory ran out again, so I need to tune those parameters further.
