facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0

Detects only one class #483

Open srikanth-kilaru opened 6 years ago

srikanth-kilaru commented 6 years ago

With an Inception-ResNet backbone, I was able to detect only one class and one instance. I had to turn off FPN because I was getting a blob dimension mismatch. Could this be the reason? Thanks

MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: Inception_ResNetv2.add_inception_resnetv2_xxs_conv5_body
  NUM_CLASSES: 81
  FASTER_RCNN: True
  MASK_ON: True  # See if this resolves the inference issue
NUM_GPUS: 2
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.005
  GAMMA: 0.1
  MAX_ITER: 30000
  STEPS: [0, 15000, 20000]
  # Equivalent schedules with...
  # 1 GPU:
  #   BASE_LR: 0.0025
  #   MAX_ITER: 60000
  #   STEPS: [0, 30000, 40000]
  # 2 GPUs:
  #   BASE_LR: 0.005
  #   MAX_ITER: 30000
  #   STEPS: [0, 15000, 20000]
  # 4 GPUs:
  #   BASE_LR: 0.01
  #   MAX_ITER: 15000
  #   STEPS: [0, 7500, 10000]
  # 8 GPUs:
  #   BASE_LR: 0.02
  #   MAX_ITER: 7500
  #   STEPS: [0, 3750, 5000]
FPN:
  FPN_ON: False  # True changing for compilation error
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
  USE_GN: True  # Note: use GN on the FPN-specific layers
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_Xconv1fc_gn_head  # Note: this is a Conv GN head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
MRCNN:
  ROI_MASK_HEAD: mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs_gn  # Note: this is a GN mask head
  RESOLUTION: 28  # (output mask resolution) default 14
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 14  # default 7
  ROI_XFORM_SAMPLING_RATIO: 2  # default 0
  DILATION: 1  # default 2
  CONV_INIT: MSRAFill  # default GaussianFill
TRAIN:
  WEIGHTS: N/A
  DATASETS: ('coco_2014_train', 'coco_2014_valminusminival')
  SCALES: (500,)
  MAX_SIZE: 833
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
TEST:
  DATASETS: ('coco_2014_minival',)
  SCALE: 500
  MAX_SIZE: 833
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000
OUTPUT_DIR: .
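For reference, the commented-out schedules in the config above follow the usual linear scaling rule used throughout the Detectron configs: BASE_LR scales up with the number of GPUs, while MAX_ITER and the decay STEPS scale down by the same factor. A minimal sketch of that arithmetic (the helper name is illustrative, not part of Detectron):

# Hypothetical helper illustrating the linear LR scaling rule behind the
# "Equivalent schedules with..." comments; not part of Detectron itself.
def scale_schedule(num_gpus, base_lr_1gpu=0.0025, max_iter_1gpu=60000,
                   steps_1gpu=(0, 30000, 40000)):
    """Scale a 1-GPU schedule to num_gpus GPUs."""
    return {
        'BASE_LR': base_lr_1gpu * num_gpus,
        'MAX_ITER': max_iter_1gpu // num_gpus,
        'STEPS': [s // num_gpus for s in steps_1gpu],
    }

print(scale_schedule(2))  # BASE_LR 0.005, MAX_ITER 30000, STEPS [0, 15000, 20000]
print(scale_schedule(8))  # BASE_LR 0.02,  MAX_ITER 7500,  STEPS [0, 3750, 5000]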

ir413 commented 6 years ago

Hi @srikanth-kilaru, sorry but I don't understand your question. Could you please clarify?

srikanth-kilaru commented 6 years ago

Ilija, I was trying out Detectron with a new conv backbone (Inception_ResNetv2). I had to turn off FPN in order for training to go through successfully (more on that later). After successful training, during inference I noticed that it was not producing any .pdf files for some images, and for others it only produced detections for the 'person' class. Later I changed the 'thresh' setting in infer.py from 0.7 to 0.3, which helped visualize a few more classes. However, the accuracy of classification and segmentation was nowhere near what one sees with the stock ResNet50/ResNet101 backbone and FPN on.

I am not sure whether this low performance in detection and segmentation is due to FPN having to be turned off. I also tried to add FPN to the backbone by implementing similar functions such as def add_fpn_Inception_conv5_body(model), but I ran into a core dump because of a blob dimension check failure. Could you please help me identify the possible causes of the low performance? And if it is due to FPN being off, any help in getting past the build errors I get during training with FPN on would help a lot. Please see the attached config YAML file.

MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: Inception_ResNetv2.add_inception_resnetv2_xxs_conv5_body
  NUM_CLASSES: 81
  FASTER_RCNN: True
  MASK_ON: True  # See if this resolves the inference issue
NUM_GPUS: 2
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.005
  GAMMA: 0.1
  MAX_ITER: 30000
  STEPS: [0, 15000, 20000]
  # Equivalent schedules with...
  # 1 GPU:
  #   BASE_LR: 0.0025
  #   MAX_ITER: 60000
  #   STEPS: [0, 30000, 40000]
  # 2 GPUs:
  #   BASE_LR: 0.005
  #   MAX_ITER: 30000
  #   STEPS: [0, 15000, 20000]
  # 4 GPUs:
  #   BASE_LR: 0.01
  #   MAX_ITER: 15000
  #   STEPS: [0, 7500, 10000]
  # 8 GPUs:
  #   BASE_LR: 0.02
  #   MAX_ITER: 7500
  #   STEPS: [0, 3750, 5000]
FPN:
  FPN_ON: False
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
  USE_GN: True  # Note: use GN on the FPN-specific layers
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_Xconv1fc_gn_head  # Note: this is a Conv GN head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
MRCNN:
  ROI_MASK_HEAD: mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs_gn  # Note: this is a GN mask head
  RESOLUTION: 28  # (output mask resolution) default 14
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 14  # default 7
  ROI_XFORM_SAMPLING_RATIO: 2  # default 0
  DILATION: 1  # default 2
  CONV_INIT: MSRAFill  # default GaussianFill
TRAIN:
  WEIGHTS: N/A
  DATASETS: ('coco_2014_train', 'coco_2014_valminusminival')
  SCALES: (500,)
  MAX_SIZE: 833
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
TEST:
  DATASETS: ('coco_2014_minival',)
  SCALE: 500
  MAX_SIZE: 833
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000
OUTPUT_DIR: .

Thanks much


srikanth-kilaru commented 6 years ago

[image: 17790319373_bd19b24cfc_k] Here's an example of very poor inference on an image that was previously inferred well with the ResNet backbone and FPN on.

liuliu66 commented 6 years ago

@srikanth-kilaru Hi, could you please share your Inception-ResNet conv body build file? And do you mean that when you train it on a single class it works fine?

johannathiemich commented 6 years ago

I appear to be running into the same problem, the only differences being that my inference is even worse and I am using a SqueezeNet backbone. I also cannot get FPN to work, so I turned it off as well. Has anybody found a solution to this problem yet?

liuliu66 commented 6 years ago

@johannathiemich Hi, if you hit the blob channel mismatch issue, check the file detectron/roi_data/rpn.py. In the add_rpn_blobs function, a field stride is defined for the RPN roi_data blobs at each FPN level. That means that if your FPN level scales are (1/32, 1/16, 1/8, 1/4), the same as ResNet-50, the code is correct; otherwise you need to change the field stride to match the FPN level scales of your CNN backbone, such as SqueezeNet.
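For orientation, here is a rough sketch of the multi-level branch of add_rpn_blobs in detectron/roi_data/rpn.py that liuliu66 is referring to. It is written from memory rather than copied from the source, so treat the exact names as approximate; the point is that the field stride for level lvl is derived as 2**lvl, which assumes ResNet-style 1/4 ... 1/32 feature scales:

# Approximate sketch of the MULTILEVEL_RPN branch in add_rpn_blobs
# (detectron/roi_data/rpn.py); not a verbatim copy of the source.
from detectron.core.config import cfg
import detectron.roi_data.data_utils as data_utils

k_max = cfg.FPN.RPN_MAX_LEVEL  # e.g. 6
k_min = cfg.FPN.RPN_MIN_LEVEL  # e.g. 2
foas = []
for lvl in range(k_min, k_max + 1):
    field_stride = 2. ** lvl  # assumes backbone scales of 1/4, 1/8, 1/16, 1/32, ...
    anchor_sizes = (cfg.FPN.RPN_ANCHOR_START_SIZE * 2. ** (lvl - k_min), )
    anchor_aspect_ratios = cfg.FPN.RPN_ASPECT_RATIOS
    foa = data_utils.get_field_of_anchors(
        field_stride, anchor_sizes, anchor_aspect_ratios
    )
    foas.append(foa)
# If the backbone's FPN levels do not actually sit at these strides, the anchor
# and label blobs will not line up with the feature maps, which can surface as
# the blob dimension mismatch described above.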

johannathiemich commented 6 years ago

@liuliu66 Thanks for your answer. Right now I am not as concerned about not being able to use the FPN, since my net currently does not recognize anything. The few percent of performance that FPN would add are not my main focus right now. I am more worried about training the net so that it performs somewhat acceptably. Or do you think using the FPN with this net is essential even for mediocre performance?

liuliu66 commented 6 years ago

@johannathiemich Hi, in my work, Faster R-CNN with FPN actually brings a significant improvement. That may be because the aim of my work is to detect tiny objects, which benefits from the multiple scales. So if your network's performance is too poor to detect more than a few objects, and there is no problem with your dataset (image quantity, annotation format, ...), you could try your dataset on the networks that the authors provide, like mask-rcnn-resnet50, because those networks have been validated, unlike your own design. Besides, you could save the logs to see where the problem occurs. Check whether the 4D tensor in each layer is as you expect. You can also run a public dataset such as COCO to see whether the log is similar to the one from your dataset.
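One way to do that per-layer tensor check with Caffe2 is to dump blob shapes from the workspace after a forward pass. A minimal sketch (blob names depend on your model, and non-array blobs such as the roidb will simply report no shape):

from caffe2.python import workspace

# Print the shape of every blob currently in the workspace so the 4D tensors
# can be compared against expectations layer by layer.
for name in workspace.Blobs():
    blob = workspace.FetchBlob(name)
    print('{:40s} {}'.format(name, getattr(blob, 'shape', None)))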

johannathiemich commented 6 years ago

@liuliu66 I am actually working with the COCO dataset, so that should not be a problem either. I do have the (partial) output from one training run saved:

INFO train.py: 178: Loading dataset: ('coco_2014_train',)
loading annotations into memory...
Done (t=15.43s)
creating index...
index created!
INFO roidb.py:  49: Appending horizontally-flipped training examples...
INFO roidb.py:  51: Loaded dataset: coco_2014_train
INFO roidb.py: 135: Filtered 1404 roidb entries: 165566 -> 164162
INFO roidb.py:  67: Computing bounding-box regression targets...
INFO roidb.py:  69: done
INFO train.py: 182: 164162 roidb entries
INFO net.py:  59: Loading weights from: /home/************/Detectron/detectron/models/squeezenet/squeezenet_model.pkl
INFO net.py:  88: fire2-squeeze1x1_w not found
INFO net.py:  88: fire2-squeeze1x1_b not found
INFO net.py:  88: fire2-expand1x1_w not found
INFO net.py:  88: fire2-expand1x1_b not found
INFO net.py:  88: fire2-expand3x3_w not found
INFO net.py:  88: fire2-expand3x3_b not found
INFO net.py:  88: fire3-squeeze1x1_w not found
INFO net.py:  88: fire3-squeeze1x1_b not found
INFO net.py:  88: fire3-expand1x1_w not found
INFO net.py:  88: fire3-expand1x1_b not found
INFO net.py:  88: fire3-expand3x3_w not found
INFO net.py:  88: fire3-expand3x3_b not found
INFO net.py:  88: fire4-squeeze1x1_w not found
INFO net.py:  88: fire4-squeeze1x1_b not found
INFO net.py:  88: fire4-expand1x1_w not found
INFO net.py:  88: fire4-expand1x1_b not found
INFO net.py:  88: fire4-expand3x3_w not found
INFO net.py:  88: fire4-expand3x3_b not found
INFO net.py:  88: fire5-squeeze1x1_w not found
INFO net.py:  88: fire5-squeeze1x1_b not found
INFO net.py:  88: fire5-expand1x1_w not found
INFO net.py:  88: fire5-expand1x1_b not found
INFO net.py:  88: fire5-expand3x3_w not found
INFO net.py:  88: fire5-expand3x3_b not found
INFO net.py:  88: fire6-squeeze1x1_w not found
INFO net.py:  88: fire6-squeeze1x1_b not found
INFO net.py:  88: fire6-expand1x1_w not found
INFO net.py:  88: fire6-expand1x1_b not found
INFO net.py:  88: fire6-expand3x3_w not found
INFO net.py:  88: fire6-expand3x3_b not found
INFO net.py:  88: fire7-squeeze1x1_w not found
INFO net.py:  88: fire7-squeeze1x1_b not found
INFO net.py:  88: fire7-expand1x1_w not found
INFO net.py:  88: fire7-expand1x1_b not found
INFO net.py:  88: fire7-expand3x3_w not found
INFO net.py:  88: fire7-expand3x3_b not found
INFO net.py:  88: fire8-squeeze1x1_w not found
INFO net.py:  88: fire8-squeeze1x1_b not found
INFO net.py:  88: fire8-expand1x1_w not found
INFO net.py:  88: fire8-expand1x1_b not found
INFO net.py:  88: fire8-expand3x3_w not found
INFO net.py:  88: fire8-expand3x3_b not found
INFO net.py:  88: fire9-squeeze1x1_w not found
INFO net.py:  88: fire9-squeeze1x1_b not found
INFO net.py:  88: fire9-expand1x1_w not found
INFO net.py:  88: fire9-expand1x1_b not found
INFO net.py:  88: fire9-expand3x3_w not found
INFO net.py:  88: fire9-expand3x3_b not found
INFO net.py:  88: conv_rpn_w not found
INFO net.py:  88: conv_rpn_b not found
INFO net.py:  88: rpn_cls_logits_w not found
INFO net.py:  88: rpn_cls_logits_b not found
INFO net.py:  88: rpn_bbox_pred_w not found
INFO net.py:  88: rpn_bbox_pred_b not found
INFO net.py:  88: fc6_w not found
INFO net.py:  88: fc6_b not found
INFO net.py:  88: fc7_w not found
INFO net.py:  88: fc7_b not found
INFO net.py:  88: cls_score_w not found
INFO net.py:  88: cls_score_b not found
INFO net.py:  88: bbox_pred_w not found
INFO net.py:  88: bbox_pred_b not found
INFO net.py:  88: fcn1_w not found
INFO net.py:  88: fcn1_b not found
INFO net.py:  88: conv5_mask_w not found
INFO net.py:  88: conv5_mask_b not found
INFO net.py:  88: mask_fcn_logits_w not found
INFO net.py:  88: mask_fcn_logits_b not found
INFO net.py:  88: mask_fcn_logits_up_w not found
INFO net.py:  88: mask_fcn_logits_up_b not found
I0710 15:07:34.117735 22253 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 0.000329956 secs
I0710 15:07:34.118227 22253 net_dag.cc:46] Number of parallel execution chains 230 Number of operators = 347
INFO train.py: 166: Outputs saved to: /home/thiemi/Detectron/detectron/train/squeezenet/config4_mask_small_head/train/coco_2014_train/generalized_rcnn
INFO loader.py: 229: Pre-filling mini-batch queue...
INFO loader.py: 234:   [0/64]
INFO loader.py: 234:   [2/64]
INFO loader.py: 234:   [4/64]
I0710 15:07:35.042738 23426 context_gpu.cu:314] GPU 0: 660 MB
I0710 15:07:35.042780 23426 context_gpu.cu:318] Total: 660 MB
INFO loader.py: 234:   [5/64]
INFO loader.py: 234:   [6/64]
I0710 15:07:35.274229 23426 context_gpu.cu:314] GPU 0: 788 MB
I0710 15:07:35.274263 23426 context_gpu.cu:318] Total: 788 MB
INFO loader.py: 234:   [8/64]
INFO loader.py: 234:   [11/64]
INFO loader.py: 234:   [16/64]
INFO loader.py: 234:   [21/64]
INFO loader.py: 234:   [27/64]
INFO loader.py: 234:   [32/64]
INFO loader.py: 234:   [36/64]
INFO loader.py: 234:   [41/64]
INFO loader.py: 234:   [47/64]
INFO loader.py: 234:   [52/64]
INFO loader.py: 234:   [56/64]
INFO loader.py: 234:   [60/64]
INFO detector.py: 479: Changing learning rate 0.000000 -> 0.000833 at iter 0
I0710 15:07:36.522253 23486 context_gpu.cu:314] GPU 0: 982 MB
I0710 15:07:36.522284 23486 context_gpu.cu:318] Total: 982 MB
I0710 15:07:36.526355 23486 context_gpu.cu:314] GPU 0: 1113 MB
I0710 15:07:36.526381 23486 context_gpu.cu:318] Total: 1113 MB
I0710 15:07:36.545336 23486 context_gpu.cu:314] GPU 0: 1250 MB
I0710 15:07:36.545369 23486 context_gpu.cu:318] Total: 1250 MB
I0710 15:07:36.554329 23488 context_gpu.cu:314] GPU 0: 1383 MB
I0710 15:07:36.554354 23488 context_gpu.cu:318] Total: 1383 MB
I0710 15:07:36.575675 23487 context_gpu.cu:314] GPU 0: 1533 MB
I0710 15:07:36.575712 23487 context_gpu.cu:318] Total: 1533 MB
I0710 15:07:36.599647 23487 context_gpu.cu:314] GPU 0: 1669 MB
I0710 15:07:36.599673 23487 context_gpu.cu:318] Total: 1669 MB
I0710 15:07:36.616735 23487 context_gpu.cu:314] GPU 0: 1821 MB
I0710 15:07:36.616761 23487 context_gpu.cu:318] Total: 1821 MB
I0710 15:07:36.648779 23487 context_gpu.cu:314] GPU 0: 1956 MB
I0710 15:07:36.648819 23487 context_gpu.cu:318] Total: 1956 MB
I0710 15:07:36.684108 23489 context_gpu.cu:314] GPU 0: 2086 MB
I0710 15:07:36.684149 23489 context_gpu.cu:318] Total: 2086 MB
I0710 15:07:36.749707 23486 context_gpu.cu:314] GPU 0: 2216 MB
I0710 15:07:36.749732 23486 context_gpu.cu:318] Total: 2216 MB
I0710 15:07:37.044325 23489 context_gpu.cu:314] GPU 0: 2449 MB
I0710 15:07:37.044363 23489 context_gpu.cu:318] Total: 2449 MB
I0710 15:07:37.085489 23489 context_gpu.cu:314] GPU 0: 2581 MB
I0710 15:07:37.085533 23489 context_gpu.cu:318] Total: 2581 MB
I0710 15:07:37.120723 23487 context_gpu.cu:314] GPU 0: 2715 MB
I0710 15:07:37.120757 23487 context_gpu.cu:318] Total: 2715 MB
I0710 15:07:37.131479 23487 context_gpu.cu:314] GPU 0: 2849 MB
I0710 15:07:37.131496 23487 context_gpu.cu:318] Total: 2849 MB
I0710 15:07:37.162952 23488 context_gpu.cu:314] GPU 0: 3059 MB
I0710 15:07:37.162981 23488 context_gpu.cu:318] Total: 3059 MB
INFO net.py: 212: Printing model: generalized_rcnn
INFO net.py: 249: data                        : (2, 3, 800, 1333)    => conv1                       : (2, 64, 399, 666)    ------- (op: Conv)
INFO net.py: 249: conv1                       : (2, 64, 399, 666)    => relu_conv1                  : (2, 64, 399, 666)    ------- (op: Relu)
INFO net.py: 249: relu_conv1                  : (2, 64, 399, 666)    => pool1                       : (2, 64, 199, 332)    ------- (op: MaxPool)
INFO net.py: 249: pool1                       : (2, 64, 199, 332)    => fire2-squeeze1x1            : (2, 16, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-squeeze1x1            : (2, 16, 199, 332)    => fire2-relu_sqeeze1x1        : (2, 16, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-relu_sqeeze1x1        : (2, 16, 199, 332)    => fire2-expand1x1             : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-expand1x1             : (2, 64, 199, 332)    => fire2-relu_expand1x1        : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-relu_sqeeze1x1        : (2, 16, 199, 332)    => fire2-expand3x3             : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-expand3x3             : (2, 64, 199, 332)    => fire2-relu_expand3x3        : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-relu_expand1x1        : (2, 64, 199, 332)    => fire2-concat                : (2, 128, 199, 332)   ------- (op: Concat)
INFO net.py: 249: fire2-relu_expand3x3        : (2, 64, 199, 332)    => fire2-concat                : (2, 128, 199, 332)   ------|
INFO net.py: 249: fire2-concat                : (2, 128, 199, 332)   => fire3-squeeze1x1            : (2, 16, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-squeeze1x1            : (2, 16, 199, 332)    => fire3-relu_squeeze1x1       : (2, 16, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-relu_squeeze1x1       : (2, 16, 199, 332)    => fire3-expand1x1             : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-expand1x1             : (2, 64, 199, 332)    => fire3-relu_expand1x1        : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-relu_squeeze1x1       : (2, 16, 199, 332)    => fire3-expand3x3             : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-expand3x3             : (2, 64, 199, 332)    => fire3-relu_expand3x3        : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-relu_expand1x1        : (2, 64, 199, 332)    => fire3-concat                : (2, 128, 199, 332)   ------- (op: Concat)
INFO net.py: 249: fire3-relu_expand3x3        : (2, 64, 199, 332)    => fire3-concat                : (2, 128, 199, 332)   ------|
INFO net.py: 249: fire3-concat                : (2, 128, 199, 332)   => pool3                       : (2, 128, 99, 165)    ------- (op: MaxPool)
INFO net.py: 249: pool3                       : (2, 128, 99, 165)    => fire4-squeeze1x1            : (2, 32, 99, 165)     ------- (op: Conv)
INFO net.py: 249: fire4-squeeze1x1            : (2, 32, 99, 165)     => fire4-relu_squeeze1x1       : (2, 32, 99, 165)     ------- (op: Relu)
INFO net.py: 249: fire4-relu_squeeze1x1       : (2, 32, 99, 165)     => fire4-expand1x1             : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire4-expand1x1             : (2, 128, 99, 165)    => fire4-relu_expand1x1        : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire4-relu_squeeze1x1       : (2, 32, 99, 165)     => fire4-expand3x3             : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire4-expand3x3             : (2, 128, 99, 165)    => fire4-relu_expand3x3        : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire4-relu_expand1x1        : (2, 128, 99, 165)    => fire4-concat                : (2, 256, 99, 165)    ------- (op: Concat)
INFO net.py: 249: fire4-relu_expand3x3        : (2, 128, 99, 165)    => fire4-concat                : (2, 256, 99, 165)    ------|
INFO net.py: 249: fire4-concat                : (2, 256, 99, 165)    => fire5-squeeze1x1            : (2, 32, 99, 165)     ------- (op: Conv)
INFO net.py: 249: fire5-squeeze1x1            : (2, 32, 99, 165)     => fire5-relu_squeeze1x1       : (2, 32, 99, 165)     ------- (op: Relu)
INFO net.py: 249: fire5-relu_squeeze1x1       : (2, 32, 99, 165)     => fire5-expand1x1             : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire5-expand1x1             : (2, 128, 99, 165)    => fire5-relu_expand1x1        : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire5-relu_squeeze1x1       : (2, 32, 99, 165)     => fire5-expand3x3             : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire5-expand3x3             : (2, 128, 99, 165)    => fire5-relu_expand3x3        : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire5-relu_expand1x1        : (2, 128, 99, 165)    => fire5-concat                : (2, 256, 99, 165)    ------- (op: Concat)
INFO net.py: 249: fire5-relu_expand3x3        : (2, 128, 99, 165)    => fire5-concat                : (2, 256, 99, 165)    ------|
INFO net.py: 249: fire5-concat                : (2, 256, 99, 165)    => pool5                       : (2, 256, 49, 82)     ------- (op: MaxPool)
INFO net.py: 249: pool5                       : (2, 256, 49, 82)     => fire6-squeeze1x1            : (2, 48, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire6-squeeze1x1            : (2, 48, 49, 82)      => fire6-relu_squeeze1x1       : (2, 48, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire6-relu_squeeze1x1       : (2, 48, 49, 82)      => fire6-expand1x1             : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire6-expand1x1             : (2, 192, 49, 82)     => fire6-relu_expand1x1        : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire6-relu_squeeze1x1       : (2, 48, 49, 82)      => fire6-expand3x3             : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire6-expand3x3             : (2, 192, 49, 82)     => fire6-relu_expand3x3        : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire6-relu_expand1x1        : (2, 192, 49, 82)     => fire6-concat                : (2, 384, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire6-relu_expand3x3        : (2, 192, 49, 82)     => fire6-concat                : (2, 384, 49, 82)     ------|
INFO net.py: 249: fire6-concat                : (2, 384, 49, 82)     => fire7-squeeze1x1            : (2, 48, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire7-squeeze1x1            : (2, 48, 49, 82)      => fire7-relu_squeeze1x1       : (2, 48, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire7-relu_squeeze1x1       : (2, 48, 49, 82)      => fire7-expand1x1             : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire7-expand1x1             : (2, 192, 49, 82)     => fire7-relu_expand1x1        : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire7-relu_squeeze1x1       : (2, 48, 49, 82)      => fire7-expand3x3             : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire7-expand3x3             : (2, 192, 49, 82)     => fire7-relu_expand3x3        : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire7-relu_expand1x1        : (2, 192, 49, 82)     => fire7-concat                : (2, 384, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire7-relu_expand3x3        : (2, 192, 49, 82)     => fire7-concat                : (2, 384, 49, 82)     ------|
INFO net.py: 249: fire7-concat                : (2, 384, 49, 82)     => fire8-squeeze1x1            : (2, 64, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire8-squeeze1x1            : (2, 64, 49, 82)      => fire8-relu_squeeze1x1       : (2, 64, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire8-relu_squeeze1x1       : (2, 64, 49, 82)      => fire8-expand1x1             : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire8-expand1x1             : (2, 256, 49, 82)     => fire8-relu_expand1x1        : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire8-relu_squeeze1x1       : (2, 64, 49, 82)      => fire8-expand3x3             : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire8-expand3x3             : (2, 256, 49, 82)     => fire8-relu_expand3x3        : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire8-relu_expand1x1        : (2, 256, 49, 82)     => fire8-concat                : (2, 512, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire8-relu_expand3x3        : (2, 256, 49, 82)     => fire8-concat                : (2, 512, 49, 82)     ------|
INFO net.py: 249: fire8-concat                : (2, 512, 49, 82)     => fire9-squeeze1x1            : (2, 64, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire9-squeeze1x1            : (2, 64, 49, 82)      => fire9-relu_squeeze1x1       : (2, 64, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire9-relu_squeeze1x1       : (2, 64, 49, 82)      => fire9-expand1x1             : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire9-expand1x1             : (2, 256, 49, 82)     => fire9-relu_expand1x1        : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire9-relu_squeeze1x1       : (2, 64, 49, 82)      => fire9-expand3x3             : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire9-expand3x3             : (2, 256, 49, 82)     => fire9-relu_expand3x3        : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire9-relu_expand1x1        : (2, 256, 49, 82)     => fire9-concat                : (2, 512, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire9-relu_expand3x3        : (2, 256, 49, 82)     => fire9-concat                : (2, 512, 49, 82)     ------|
INFO net.py: 249: fire9-concat                : (2, 512, 49, 82)     => drop9                       : (2, 512, 49, 82)     ------- (op: Dropout)
INFO net.py: 249: drop9                       : (2, 512, 49, 82)     => conv10                      : (2, 1000, 49, 82)    ------- (op: Conv)
INFO net.py: 249: conv10                      : (2, 1000, 49, 82)    => relu_conv10                 : (2, 1000, 49, 82)    ------- (op: Relu)
INFO net.py: 249: relu_conv10                 : (2, 1000, 49, 82)    => pool10                      : (2, 1000, 1, 1)      ------- (op: AveragePool)
INFO net.py: 249: pool10                      : (2, 1000, 1, 1)      => prob                        : (2, 1000, 1, 1)      ------- (op: Softmax)
INFO net.py: 249: prob                        : (2, 1000, 1, 1)      => conv_rpn                    : (2, 1000, 1, 1)      ------- (op: Conv)
INFO net.py: 249: conv_rpn                    : (2, 1000, 1, 1)      => conv_rpn                    : (2, 1000, 1, 1)      ------- (op: Relu)
INFO net.py: 249: conv_rpn                    : (2, 1000, 1, 1)      => rpn_cls_logits              : (2, 12, 1, 1)        ------- (op: Conv)
INFO net.py: 249: conv_rpn                    : (2, 1000, 1, 1)      => rpn_bbox_pred               : (2, 48, 1, 1)        ------- (op: Conv)
INFO net.py: 249: rpn_cls_logits              : (2, 12, 1, 1)        => rpn_cls_probs               : (2, 12, 1, 1)        ------- (op: Sigmoid)
INFO net.py: 249: rpn_cls_probs               : (2, 12, 1, 1)        => rpn_rois                    : (24, 5)              ------- (op: Python:GenerateProposalsOp:rpn_cls_probs,rpn_bbox_pred,im_info)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 1, 1)        => rpn_rois                    : (24, 5)              ------|
INFO net.py: 249: im_info                     : (2, 3)               => rpn_rois                    : (24, 5)              ------|
INFO net.py: 249: rpn_rois                    : (24, 5)              => rois                        : (28, 5)              ------- (op: Python:GenerateProposalLabelsOp:rpn_rois,roidb,im_info)
INFO net.py: 249: roidb                       : (7028,)              => rois                        : (28, 5)              ------|
INFO net.py: 249: im_info                     : (2, 3)               => rois                        : (28, 5)              ------|
INFO net.py: 249: rpn_labels_int32_wide       : (2, 12, 84, 84)      => rpn_labels_int32            : (2, 12, 1, 1)        ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_cls_logits              : (2, 12, 1, 1)        => rpn_labels_int32            : (2, 12, 1, 1)        ------|
INFO net.py: 249: rpn_bbox_targets_wide       : (2, 48, 84, 84)      => rpn_bbox_targets            : (2, 48, 1, 1)        ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 1, 1)        => rpn_bbox_targets            : (2, 48, 1, 1)        ------|
INFO net.py: 249: rpn_bbox_inside_weights_wide: (2, 48, 84, 84)      => rpn_bbox_inside_weights     : (2, 48, 1, 1)        ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 1, 1)        => rpn_bbox_inside_weights     : (2, 48, 1, 1)        ------|
INFO net.py: 249: rpn_bbox_outside_weights_wide: (2, 48, 84, 84)      => rpn_bbox_outside_weights    : (2, 48, 1, 1)        ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 1, 1)        => rpn_bbox_outside_weights    : (2, 48, 1, 1)        ------|
INFO net.py: 249: rpn_cls_logits              : (2, 12, 1, 1)        => loss_rpn_cls                : ()                   ------- (op: SigmoidCrossEntropyLoss)
INFO net.py: 249: rpn_labels_int32            : (2, 12, 1, 1)        => loss_rpn_cls                : ()                   ------|
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 1, 1)        => loss_rpn_bbox               : ()                   ------- (op: SmoothL1Loss)
INFO net.py: 249: rpn_bbox_targets            : (2, 48, 1, 1)        => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: rpn_bbox_inside_weights     : (2, 48, 1, 1)        => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: rpn_bbox_outside_weights    : (2, 48, 1, 1)        => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: prob                        : (2, 1000, 1, 1)      => roi_feat                    : (28, 1000, 7, 7)     ------- (op: RoIAlign)
INFO net.py: 249: rois                        : (28, 5)              => roi_feat                    : (28, 1000, 7, 7)     ------|
INFO net.py: 249: roi_feat                    : (28, 1000, 7, 7)     => fc6                         : (28, 1024)           ------- (op: FC)
INFO net.py: 249: fc6                         : (28, 1024)           => fc6                         : (28, 1024)           ------- (op: Relu)
INFO net.py: 249: fc6                         : (28, 1024)           => fc7                         : (28, 1024)           ------- (op: FC)
INFO net.py: 249: fc7                         : (28, 1024)           => fc7                         : (28, 1024)           ------- (op: Relu)
INFO net.py: 249: fc7                         : (28, 1024)           => cls_score                   : (28, 81)             ------- (op: FC)
INFO net.py: 249: fc7                         : (28, 1024)           => bbox_pred                   : (28, 324)            ------- (op: FC)
INFO net.py: 249: cls_score                   : (28, 81)             => cls_prob                    : (28, 81)             ------- (op: SoftmaxWithLoss)
INFO net.py: 249: labels_int32                : (28,)                => cls_prob                    : (28, 81)             ------|
INFO net.py: 249: bbox_pred                   : (28, 324)            => loss_bbox                   : ()                   ------- (op: SmoothL1Loss)
INFO net.py: 249: bbox_targets                : (28, 324)            => loss_bbox                   : ()                   ------|
INFO net.py: 249: bbox_inside_weights         : (28, 324)            => loss_bbox                   : ()                   ------|
INFO net.py: 249: bbox_outside_weights        : (28, 324)            => loss_bbox                   : ()                   ------|
INFO net.py: 249: cls_prob                    : (28, 81)             => accuracy_cls                : ()                   ------- (op: Accuracy)
INFO net.py: 249: labels_int32                : (28,)                => accuracy_cls                : ()                   ------|
INFO net.py: 249: prob                        : (2, 1000, 1, 1)      => _[mask]_roi_feat            : (4, 1000, 7, 7)      ------- (op: RoIAlign)
INFO net.py: 249: mask_rois                   : (4, 5)               => _[mask]_roi_feat            : (4, 1000, 7, 7)      ------|
INFO net.py: 249: _[mask]_roi_feat            : (4, 1000, 7, 7)      => _[mask]_fcn1                : (4, 1000, 7, 7)      ------- (op: Conv)
INFO net.py: 249: _[mask]_fcn1                : (4, 1000, 7, 7)      => _[mask]_fcn1                : (4, 1000, 7, 7)      ------- (op: Relu)
INFO net.py: 249: _[mask]_fcn1                : (4, 1000, 7, 7)      => conv5_mask                  : (4, 1000, 14, 14)    ------- (op: ConvTranspose)
INFO net.py: 249: conv5_mask                  : (4, 1000, 14, 14)    => conv5_mask                  : (4, 1000, 14, 14)    ------- (op: Relu)
INFO net.py: 249: conv5_mask                  : (4, 1000, 14, 14)    => mask_fcn_logits             : (4, 81, 14, 14)      ------- (op: Conv)
INFO net.py: 249: mask_fcn_logits             : (4, 81, 14, 14)      => mask_fcn_logits_up          : (4, 81, 28, 28)      ------- (op: ConvTranspose)
INFO net.py: 249: mask_fcn_logits_up          : (4, 81, 28, 28)      => loss_mask                   : ()                   ------- (op: SigmoidCrossEntropyLoss)
INFO net.py: 249: masks_int32                 : (4, 63504)           => loss_mask                   : ()                   ------|

I am using a (pretrained?) .pkl file. Since there is no SqueezeNet .pkl file available, I converted a .caffemodel file to .pkl format using the Detectron script pickle_caffe_blobs.py. Then I rebuilt the SqueezeNet model structure in Caffe2 manually, referencing the equivalent .prototxt file from the repository that also contained the .caffemodel file. The only thing that really stands out to me in this output is the lines stating 'XXXXX not found', where a layer does not seem to correspond to a layer I defined in Caffe2. But is that really that big of a problem?

liuliu66 commented 6 years ago

@johannathiemich Hi, I think your problem occurs here:

INFO net.py: 249: relu_conv10                 : (2, 1000, 49, 82)    => pool10                      : (2, 1000, 1, 1)      ------- (op: AveragePool)
INFO net.py: 249: pool10                      : (2, 1000, 1, 1)      => prob                        : (2, 1000, 1, 1)      ------- (op: Softmax)
INFO net.py: 249: prob                        : (2, 1000, 1, 1)      => conv_rpn                    : (2, 1000, 1, 1)      ------- (op: Conv)

The average pool, softmax and conv layers are there in the original SqueezeNet model because that net is used for classification, so the output tensor shape becomes (2, 1000, 1, 1), where the (1, 1) means the spatial dimensions have collapsed to a single value. But our goal is localization, so the feature map needs to keep its spatial extent for the next several layers.

The solution, I think, is to stop your hand-written SqueezeNet at fire9-concat and delete the following pool, softmax and conv layers, so that fire9-concat is linked to the conv_rpn layer. Re-run your net; I believe your result will then be fine.
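A rough sketch of what such a conv body could look like in Detectron's Caffe2 modeling style. The function and helper names here are hypothetical placeholders (not part of Detectron or of johannathiemich's code), and the channel counts simply mirror the shapes in the log above; the key point is returning fire9-concat, its channel count, and its spatial scale instead of attaching the classification head:

# Hypothetical, untested sketch of a SqueezeNet conv body that stops at
# fire9-concat, following Detectron's convention that a CONV_BODY function
# returns (final blob, number of output channels, spatial scale).

def add_fire_module(model, blob_in, prefix, dim_in, squeeze, expand):
    # Squeeze 1x1 -> parallel 1x1 / 3x3 expands -> channel-wise concat.
    s = model.Conv(blob_in, prefix + '-squeeze1x1', dim_in, squeeze, 1)
    s = model.Relu(s, s)
    e1 = model.Conv(s, prefix + '-expand1x1', squeeze, expand, 1)
    e1 = model.Relu(e1, e1)
    e3 = model.Conv(s, prefix + '-expand3x3', squeeze, expand, 3, pad=1)
    e3 = model.Relu(e3, e3)
    concat, _ = model.net.Concat(
        [e1, e3], [prefix + '-concat', prefix + '-concat_split'], axis=1
    )
    return concat

def add_squeezenet_conv5_body(model):
    model.Conv('data', 'conv1', 3, 64, 3, stride=2)
    model.Relu('conv1', 'conv1')
    model.MaxPool('conv1', 'pool1', kernel=3, stride=2)
    blob = add_fire_module(model, 'pool1', 'fire2', 64, 16, 64)
    blob = add_fire_module(model, blob, 'fire3', 128, 16, 64)
    model.MaxPool(blob, 'pool3', kernel=3, stride=2)
    blob = add_fire_module(model, 'pool3', 'fire4', 128, 32, 128)
    blob = add_fire_module(model, blob, 'fire5', 256, 32, 128)
    model.MaxPool(blob, 'pool5', kernel=3, stride=2)
    blob = add_fire_module(model, 'pool5', 'fire6', 256, 48, 192)
    blob = add_fire_module(model, blob, 'fire7', 384, 48, 192)
    blob = add_fire_module(model, blob, 'fire8', 384, 64, 256)
    blob = add_fire_module(model, blob, 'fire9', 512, 64, 256)
    # Stop at fire9-concat: no drop9 / conv10 / pool10 / softmax classification
    # head, so the RPN and RoI heads receive a spatial feature map rather than
    # a (N, 1000, 1, 1) tensor.
    return blob, 512, 1. / 16.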

liuliu66 commented 6 years ago

@johannathiemich By the way, if you want to use a pretrained model .pkl file to fine-tune your model, you can use the Python json package (just import json) and f = json.load(open('xxx.pkl', 'r')) to read and parse the pkl file. The output will be a dictionary, and the keys in this dict must match the blob names defined in your SqueezeNet code. Check them; if they match, your model can read the pretrained pkl.

johannathiemich commented 6 years ago

@liuliu66 Thanks again very much for your answer, I adapted my implementation and I am running training again now. I will let you know if it worked!

johannathiemich commented 6 years ago

@liuliu66 Unfortunately, so far it does not look like that was the solution to my problem. Also, when trying to execute f = json.load(open('xxx.pkl', 'r')) (with the right path to the file, of course) I get the error: ValueError: No JSON object could be decoded

liuliu66 commented 6 years ago

@johannathiemich Hi, could you paste part of your training log after revising the network? And sorry for the mistake I made with the pkl loading. Try this code instead:

import cPickle as pickle
f = pickle.load(open('xxx.pkl', 'r'))
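Building on that snippet, a small sketch for listing the parameter names stored in the .pkl so they can be compared with the blob names the net tries to load (the optional 'blobs' wrapper key is an assumption; some converted checkpoints store the weights at the top level):

import cPickle as pickle  # Python 2, as in the snippet above

with open('squeezenet_model.pkl', 'rb') as f:
    data = pickle.load(f)

# Some Detectron-style checkpoints nest the weights under a 'blobs' key.
blobs = data.get('blobs', data) if isinstance(data, dict) else data
for name in sorted(blobs.keys()):
    print(name, getattr(blobs[name], 'shape', None))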

johannathiemich commented 6 years ago

@liuliu66 No problem, in the meantime I found a piece of code that seems to work. When printing the layer of the .pkl file, the following layer names are shown: [u'fire4/expand3x3_w', u'fire2/expand3x3_b', u'fire4/squeeze1x1_w', u'fire6/expand1x1_w', u'fire8/squeeze1x1_b', u'fire9/expand1x1_b', u'fire2/expand3x3_w', u'fire4/expand3x3_b', u'fire6/expand1x1_b', u'fire9/expand1x1_w', u'fire4/squeeze1x1_b', u'fire6/squeeze1x1_w', u'fire2/squeeze1x1_w', u'fire3/squeeze1x1_b', u'fire4/expand1x1_b', u'fire6/expand3x3_b', u'fire2/squeeze1x1_b', u'fire4/expand1x1_w', u'fire6/squeeze1x1_b', u'conv1_w', u'fire2/expand1x1_w', u'fire5/expand3x3_w', u'fire6/expand3x3_w', u'fire5/expand3x3_b', u'fire7/expand3x3_b', u'fire7/expand1x1_w', u'fire7/squeeze1x1_b', u'fire3/expand1x1_w', u'fire7/expand3x3_w', u'fire8/squeeze1x1_w', u'conv10_b', u'fire8/expand3x3_b', u'fire3/expand1x1_b', u'fire7/expand1x1_b', u'fire7/squeeze1x1_w', u'fire8/expand3x3_w', u'conv10_w', u'fire9/squeeze1x1_b', u'fire3/expand3x3_w', u'fire3/squeeze1x1_w', u'fire5/expand1x1_b', u'fire8/expand1x1_b', u'fire9/expand3x3_b', u'fire2/expand1x1_b', u'fire3/expand3x3_b', u'fire5/squeeze1x1_w', u'conv1_b', u'fire9/squeeze1x1_w', u'fire9/expand3x3_w', u'fire5/squeeze1x1_b', u'fire8/expand1x1_w', u'fire5/expand1x1_w'] Yet still, when I rename my network layers to exactly those layers, it still says that it cannot find the following layers:

NFO net.py:  59: Loading weights from: /home/thiemi/Detectron/detectron/models/squeezenet/squeezenet_model.pkl
INFO net.py:  88: fire2-squeeze1x1_w_w not found
INFO net.py:  88: fire2-squeeze1x1_w_b not found
INFO net.py:  88: fire2-expand1x1_w_w not found
INFO net.py:  88: fire2-expand1x1_w_b not found
INFO net.py:  88: fire2-expand3x3_w_w not found
INFO net.py:  88: fire2-expand3x3_w_b not found
INFO net.py:  88: fire3-squeeze1x1_w_w not found
INFO net.py:  88: fire3-squeeze1x1_w_b not found
INFO net.py:  88: fire3-expand1x1_w_w not found
INFO net.py:  88: fire3-expand1x1_w_b not found
INFO net.py:  88: fire3-expand3x3_w_w not found
INFO net.py:  88: fire3-expand3x3_w_b not found
INFO net.py:  88: fire4-squeeze1x1_w_w not found
INFO net.py:  88: fire4-squeeze1x1_w_b not found
INFO net.py:  88: fire4-expand1x1_w_w not found
INFO net.py:  88: fire4-expand1x1_w_b not found
INFO net.py:  88: fire4-expand3x3_w_w not found
INFO net.py:  88: fire4-expand3x3_w_b not found
INFO net.py:  88: fire5-squeeze1x1_w_w not found
INFO net.py:  88: fire5-squeeze1x1_w_b not found
INFO net.py:  88: fire5-expand1x1_w_w not found
INFO net.py:  88: fire5-expand1x1_w_b not found
INFO net.py:  88: fire5-expand3x3_w_w not found
INFO net.py:  88: fire5-expand3x3_w_b not found
INFO net.py:  88: fire6-squeeze1x1_w_w not found
INFO net.py:  88: fire6-squeeze1x1_w_b not found
INFO net.py:  88: fire6-expand1x1_w_w not found
INFO net.py:  88: fire6-expand1x1_w_b not found
INFO net.py:  88: fire6-expand3x3_w_w not found
INFO net.py:  88: fire6-expand3x3_w_b not found
INFO net.py:  88: fire7-squeeze1x1_w_w not found
INFO net.py:  88: fire7-squeeze1x1_w_b not found
INFO net.py:  88: fire7-expand1x1_w_w not found
INFO net.py:  88: fire7-expand1x1_w_b not found
INFO net.py:  88: fire7-expand3x3_w_w not found
INFO net.py:  88: fire7-expand3x3_w_b not found
INFO net.py:  88: fire8-squeeze1x1_w_w not found
INFO net.py:  88: fire8-squeeze1x1_w_b not found
INFO net.py:  88: fire8-expand1x1_w_w not found
INFO net.py:  88: fire8-expand1x1_w_b not found
INFO net.py:  88: fire8-expand3x3_w_w not found
INFO net.py:  88: fire8-expand3x3_w_b not found
INFO net.py:  88: fire9-squeeze1x1_w_w not found
INFO net.py:  88: fire9-squeeze1x1_w_b not found
INFO net.py:  88: fire9-expand1x1_w_w not found
INFO net.py:  88: fire9-expand1x1_w_b not found
INFO net.py:  88: fire9-expand3x3_w_w not found
INFO net.py:  88: fire9-expand3x3_w_b not found
INFO net.py:  88: conv_rpn_w not found
INFO net.py:  88: conv_rpn_b not found
INFO net.py:  88: rpn_cls_logits_w not found
INFO net.py:  88: rpn_cls_logits_b not found
INFO net.py:  88: rpn_bbox_pred_w not found
INFO net.py:  88: rpn_bbox_pred_b not found
INFO net.py:  88: fc6_w not found
INFO net.py:  88: fc6_b not found
INFO net.py:  88: fc7_w not found
INFO net.py:  88: fc7_b not found
INFO net.py:  88: cls_score_w not found
INFO net.py:  88: cls_score_b not found
INFO net.py:  88: bbox_pred_w not found
INFO net.py:  88: bbox_pred_b not found
INFO net.py:  88: fcn1_w not found
INFO net.py:  88: fcn1_b not found
INFO net.py:  88: fcn2_w not found
INFO net.py:  88: fcn2_b not found
INFO net.py:  88: fcn3_w not found
INFO net.py:  88: fcn3_b not found
INFO net.py:  88: fcn4_w not found
INFO net.py:  88: fcn4_b not found
INFO net.py:  88: conv5_mask_w not found
INFO net.py:  88: conv5_mask_b not found
INFO net.py:  88: mask_fcn_logits_w not found
INFO net.py:  88: mask_fcn_logits_b not found
INFO net.py:  88: mask_fcn_logits_up_w not found
INFO net.py:  88: mask_fcn_logits_up_b not found
I0711 08:14:41.307646 46386 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 0.000321134 secs
I0711 08:14:41.308210 46386 net_dag.cc:46] Number of parallel execution chains 237 Number of operators = 357
INFO train.py: 166: Outputs saved to: /home/thiemi/Detectron/detectron/train/squeezenet/config5_droplayer/train/coco_2014_train/generalized_rcnn
INFO loader.py: 229: Pre-filling mini-batch queue...
INFO loader.py: 234:   [0/64]
INFO loader.py: 234:   [1/64]
I0711 08:14:42.895750 47020 context_gpu.cu:314] GPU 0: 280 MB
I0711 08:14:42.895783 47020 context_gpu.cu:318] Total: 280 MB
INFO loader.py: 234:   [4/64]
INFO loader.py: 234:   [5/64]
I0711 08:14:43.093822 47020 context_gpu.cu:314] GPU 0: 408 MB
I0711 08:14:43.093861 47020 context_gpu.cu:318] Total: 408 MB
INFO loader.py: 234:   [6/64]
INFO loader.py: 234:   [9/64]
INFO loader.py: 234:   [14/64]
INFO loader.py: 234:   [19/64]
INFO loader.py: 234:   [23/64]
INFO loader.py: 234:   [28/64]
INFO loader.py: 234:   [33/64]
INFO loader.py: 234:   [39/64]
INFO loader.py: 234:   [43/64]
INFO loader.py: 234:   [48/64]
INFO loader.py: 234:   [53/64]
INFO loader.py: 234:   [59/64]
INFO detector.py: 479: Changing learning rate 0.000000 -> 0.006667 at iter 0
I0711 08:14:44.409967 47020 context_gpu.cu:314] GPU 0: 560 MB
I0711 08:14:44.410001 47020 context_gpu.cu:318] Total: 560 MB
I0711 08:14:45.311997 47072 context_gpu.cu:314] GPU 0: 698 MB
I0711 08:14:45.312031 47072 context_gpu.cu:318] Total: 698 MB
I0711 08:14:45.315218 47072 context_gpu.cu:314] GPU 0: 829 MB
I0711 08:14:45.315233 47072 context_gpu.cu:318] Total: 829 MB
I0711 08:14:45.325091 47069 context_gpu.cu:314] GPU 0: 974 MB
I0711 08:14:45.325107 47069 context_gpu.cu:318] Total: 974 MB
I0711 08:14:45.333859 47069 context_gpu.cu:314] GPU 0: 1120 MB
I0711 08:14:45.333875 47069 context_gpu.cu:318] Total: 1120 MB
I0711 08:14:45.343745 47070 context_gpu.cu:314] GPU 0: 1281 MB
I0711 08:14:45.343771 47070 context_gpu.cu:318] Total: 1281 MB
I0711 08:14:45.363572 47072 context_gpu.cu:314] GPU 0: 1424 MB
I0711 08:14:45.363598 47072 context_gpu.cu:318] Total: 1424 MB
I0711 08:14:45.381585 47070 context_gpu.cu:314] GPU 0: 1553 MB
I0711 08:14:45.381611 47070 context_gpu.cu:318] Total: 1553 MB
I0711 08:14:45.409215 47072 context_gpu.cu:314] GPU 0: 1691 MB
I0711 08:14:45.409235 47072 context_gpu.cu:318] Total: 1691 MB
I0711 08:14:45.473052 47070 context_gpu.cu:314] GPU 0: 1826 MB
I0711 08:14:45.473100 47070 context_gpu.cu:318] Total: 1826 MB
I0711 08:14:45.801800 47070 context_gpu.cu:314] GPU 0: 1995 MB
I0711 08:14:45.801837 47070 context_gpu.cu:318] Total: 1995 MB
I0711 08:14:45.842908 47072 context_gpu.cu:314] GPU 0: 2126 MB
I0711 08:14:45.842936 47072 context_gpu.cu:318] Total: 2126 MB
I0711 08:14:45.867106 47072 context_gpu.cu:314] GPU 0: 2301 MB
I0711 08:14:45.867123 47072 context_gpu.cu:318] Total: 2301 MB
I0711 08:14:45.882791 47072 context_gpu.cu:314] GPU 0: 2454 MB
I0711 08:14:45.882808 47072 context_gpu.cu:318] Total: 2454 MB
I0711 08:14:45.892621 47072 context_gpu.cu:314] GPU 0: 2592 MB
I0711 08:14:45.892638 47072 context_gpu.cu:318] Total: 2592 MB
INFO net.py: 212: Printing model: generalized_rcnn
INFO net.py: 249: data                        : (2, 3, 800, 1333)    => conv1                       : (2, 64, 399, 666)    ------- (op: Conv)
INFO net.py: 249: conv1                       : (2, 64, 399, 666)    => relu_conv1                  : (2, 64, 399, 666)    ------- (op: Relu)
INFO net.py: 249: relu_conv1                  : (2, 64, 399, 666)    => pool1                       : (2, 64, 199, 332)    ------- (op: MaxPool)
INFO net.py: 249: pool1                       : (2, 64, 199, 332)    => fire2-squeeze1x1_w          : (2, 16, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-squeeze1x1_w          : (2, 16, 199, 332)    => fire2-sqeeze1x1_b           : (2, 16, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-sqeeze1x1_b           : (2, 16, 199, 332)    => fire2-expand1x1_w           : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-expand1x1_w           : (2, 64, 199, 332)    => fire2-expand1x1_b           : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-sqeeze1x1_b           : (2, 16, 199, 332)    => fire2-expand3x3_w           : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire2-expand3x3_w           : (2, 64, 199, 332)    => fire2-expand3x3_b           : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire2-expand1x1_b           : (2, 64, 199, 332)    => fire2-concat                : (2, 128, 199, 332)   ------- (op: Concat)
INFO net.py: 249: fire2-expand3x3_b           : (2, 64, 199, 332)    => fire2-concat                : (2, 128, 199, 332)   ------|
INFO net.py: 249: fire2-concat                : (2, 128, 199, 332)   => fire3-squeeze1x1_w          : (2, 16, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-squeeze1x1_w          : (2, 16, 199, 332)    => fire3-squeeze1x1_b          : (2, 16, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-squeeze1x1_b          : (2, 16, 199, 332)    => fire3-expand1x1_w           : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-expand1x1_w           : (2, 64, 199, 332)    => fire3-expand1x1_b           : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-squeeze1x1_b          : (2, 16, 199, 332)    => fire3-expand3x3_w           : (2, 64, 199, 332)    ------- (op: Conv)
INFO net.py: 249: fire3-expand3x3_w           : (2, 64, 199, 332)    => fire3-expand3x3_b           : (2, 64, 199, 332)    ------- (op: Relu)
INFO net.py: 249: fire3-expand1x1_b           : (2, 64, 199, 332)    => fire3-concat                : (2, 128, 199, 332)   ------- (op: Concat)
INFO net.py: 249: fire3-expand3x3_b           : (2, 64, 199, 332)    => fire3-concat                : (2, 128, 199, 332)   ------|
INFO net.py: 249: fire3-concat                : (2, 128, 199, 332)   => pool3                       : (2, 128, 99, 165)    ------- (op: MaxPool)
INFO net.py: 249: pool3                       : (2, 128, 99, 165)    => fire4-squeeze1x1_w          : (2, 32, 99, 165)     ------- (op: Conv)
INFO net.py: 249: fire4-squeeze1x1_w          : (2, 32, 99, 165)     => fire4-squeeze1x1_b          : (2, 32, 99, 165)     ------- (op: Relu)
INFO net.py: 249: fire4-squeeze1x1_b          : (2, 32, 99, 165)     => fire4-expand1x1_w           : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire4-expand1x1_w           : (2, 128, 99, 165)    => fire4-expand1x1_b           : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire4-squeeze1x1_b          : (2, 32, 99, 165)     => fire4-expand3x3_w           : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire4-expand3x3_w           : (2, 128, 99, 165)    => fire4-expand3x3_b           : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire4-expand1x1_b           : (2, 128, 99, 165)    => fire4-concat                : (2, 256, 99, 165)    ------- (op: Concat)
INFO net.py: 249: fire4-expand3x3_b           : (2, 128, 99, 165)    => fire4-concat                : (2, 256, 99, 165)    ------|
INFO net.py: 249: fire4-concat                : (2, 256, 99, 165)    => fire5-squeeze1x1_w          : (2, 32, 99, 165)     ------- (op: Conv)
INFO net.py: 249: fire5-squeeze1x1_w          : (2, 32, 99, 165)     => fire5-squeeze1x1_b          : (2, 32, 99, 165)     ------- (op: Relu)
INFO net.py: 249: fire5-squeeze1x1_b          : (2, 32, 99, 165)     => fire5-expand1x1_w           : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire5-expand1x1_w           : (2, 128, 99, 165)    => fire5-expand1x1_b           : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire5-squeeze1x1_b          : (2, 32, 99, 165)     => fire5-expand3x3_w           : (2, 128, 99, 165)    ------- (op: Conv)
INFO net.py: 249: fire5-expand3x3_w           : (2, 128, 99, 165)    => fire5-expand3x3_b           : (2, 128, 99, 165)    ------- (op: Relu)
INFO net.py: 249: fire5-expand1x1_b           : (2, 128, 99, 165)    => fire5-concat                : (2, 256, 99, 165)    ------- (op: Concat)
INFO net.py: 249: fire5-expand3x3_b           : (2, 128, 99, 165)    => fire5-concat                : (2, 256, 99, 165)    ------|
INFO net.py: 249: fire5-concat                : (2, 256, 99, 165)    => pool5                       : (2, 256, 49, 82)     ------- (op: MaxPool)
INFO net.py: 249: pool5                       : (2, 256, 49, 82)     => fire6-squeeze1x1_w          : (2, 48, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire6-squeeze1x1_w          : (2, 48, 49, 82)      => fire6-squeeze1x1_b          : (2, 48, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire6-squeeze1x1_b          : (2, 48, 49, 82)      => fire6-expand1x1_w           : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire6-expand1x1_w           : (2, 192, 49, 82)     => fire6-expand1x1_b           : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire6-squeeze1x1_b          : (2, 48, 49, 82)      => fire6-expand3x3_w           : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire6-expand3x3_w           : (2, 192, 49, 82)     => fire6-expand3x3_b           : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire6-expand1x1_b           : (2, 192, 49, 82)     => fire6-concat                : (2, 384, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire6-expand3x3_b           : (2, 192, 49, 82)     => fire6-concat                : (2, 384, 49, 82)     ------|
INFO net.py: 249: fire6-concat                : (2, 384, 49, 82)     => fire7-squeeze1x1_w          : (2, 48, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire7-squeeze1x1_w          : (2, 48, 49, 82)      => fire7-squeeze1x1_b          : (2, 48, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire7-squeeze1x1_b          : (2, 48, 49, 82)      => fire7-expand1x1_w           : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire7-expand1x1_w           : (2, 192, 49, 82)     => fire7-expand1x1_b           : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire7-squeeze1x1_b          : (2, 48, 49, 82)      => fire7-expand3x3_w           : (2, 192, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire7-expand3x3_w           : (2, 192, 49, 82)     => fire7-expand3x3_b           : (2, 192, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire7-expand1x1_b           : (2, 192, 49, 82)     => fire7-concat                : (2, 384, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire7-expand3x3_b           : (2, 192, 49, 82)     => fire7-concat                : (2, 384, 49, 82)     ------|
INFO net.py: 249: fire7-concat                : (2, 384, 49, 82)     => fire8-squeeze1x1_w          : (2, 64, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire8-squeeze1x1_w          : (2, 64, 49, 82)      => fire8-squeeze1x1_b          : (2, 64, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire8-squeeze1x1_b          : (2, 64, 49, 82)      => fire8-expand1x1_w           : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire8-expand1x1_w           : (2, 256, 49, 82)     => fire8-expand1x1_b           : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire8-squeeze1x1_b          : (2, 64, 49, 82)      => fire8-expand3x3_w           : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire8-expand3x3_w           : (2, 256, 49, 82)     => fire8-expand3x3_b           : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire8-expand1x1_b           : (2, 256, 49, 82)     => fire8-concat                : (2, 512, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire8-expand3x3_b           : (2, 256, 49, 82)     => fire8-concat                : (2, 512, 49, 82)     ------|
INFO net.py: 249: fire8-concat                : (2, 512, 49, 82)     => fire9-squeeze1x1_w          : (2, 64, 49, 82)      ------- (op: Conv)
INFO net.py: 249: fire9-squeeze1x1_w          : (2, 64, 49, 82)      => fire9-squeeze1x1_b          : (2, 64, 49, 82)      ------- (op: Relu)
INFO net.py: 249: fire9-squeeze1x1_b          : (2, 64, 49, 82)      => fire9-expand1x1_w           : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire9-expand1x1_w           : (2, 256, 49, 82)     => fire9-expand1x1_b           : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire9-squeeze1x1_b          : (2, 64, 49, 82)      => fire9-expand3x3_w           : (2, 256, 49, 82)     ------- (op: Conv)
INFO net.py: 249: fire9-expand3x3_w           : (2, 256, 49, 82)     => fire9-expand3x3_b           : (2, 256, 49, 82)     ------- (op: Relu)
INFO net.py: 249: fire9-expand1x1_b           : (2, 256, 49, 82)     => fire9-concat                : (2, 512, 49, 82)     ------- (op: Concat)
INFO net.py: 249: fire9-expand3x3_b           : (2, 256, 49, 82)     => fire9-concat                : (2, 512, 49, 82)     ------|
INFO net.py: 249: fire9-concat                : (2, 512, 49, 82)     => conv_rpn                    : (2, 512, 49, 82)     ------- (op: Conv)
INFO net.py: 249: conv_rpn                    : (2, 512, 49, 82)     => conv_rpn                    : (2, 512, 49, 82)     ------- (op: Relu)
INFO net.py: 249: conv_rpn                    : (2, 512, 49, 82)     => rpn_cls_logits              : (2, 12, 49, 82)      ------- (op: Conv)
INFO net.py: 249: conv_rpn                    : (2, 512, 49, 82)     => rpn_bbox_pred               : (2, 48, 49, 82)      ------- (op: Conv)
INFO net.py: 249: rpn_cls_logits              : (2, 12, 49, 82)      => rpn_cls_probs               : (2, 12, 49, 82)      ------- (op: Sigmoid)
INFO net.py: 249: rpn_cls_probs               : (2, 12, 49, 82)      => rpn_rois                    : (318, 5)             ------- (op: Python:GenerateProposalsOp:rpn_cls_probs,rpn_bbox_pred,im_info)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 49, 82)      => rpn_rois                    : (318, 5)             ------|
INFO net.py: 249: im_info                     : (2, 3)               => rpn_rois                    : (318, 5)             ------|
INFO net.py: 249: rpn_rois                    : (318, 5)             => rois                        : (322, 5)             ------- (op: Python:GenerateProposalLabelsOp:rpn_rois,roidb,im_info)
INFO net.py: 249: roidb                       : (7028,)              => rois                        : (322, 5)             ------|
INFO net.py: 249: im_info                     : (2, 3)               => rois                        : (322, 5)             ------|
INFO net.py: 249: rpn_labels_int32_wide       : (2, 12, 84, 84)      => rpn_labels_int32            : (2, 12, 49, 82)      ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_cls_logits              : (2, 12, 49, 82)      => rpn_labels_int32            : (2, 12, 49, 82)      ------|
INFO net.py: 249: rpn_bbox_targets_wide       : (2, 48, 84, 84)      => rpn_bbox_targets            : (2, 48, 49, 82)      ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 49, 82)      => rpn_bbox_targets            : (2, 48, 49, 82)      ------|
INFO net.py: 249: rpn_bbox_inside_weights_wide: (2, 48, 84, 84)      => rpn_bbox_inside_weights     : (2, 48, 49, 82)      ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 49, 82)      => rpn_bbox_inside_weights     : (2, 48, 49, 82)      ------|
INFO net.py: 249: rpn_bbox_outside_weights_wide: (2, 48, 84, 84)      => rpn_bbox_outside_weights    : (2, 48, 49, 82)      ------- (op: SpatialNarrowAs)
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 49, 82)      => rpn_bbox_outside_weights    : (2, 48, 49, 82)      ------|
INFO net.py: 249: rpn_cls_logits              : (2, 12, 49, 82)      => loss_rpn_cls                : ()                   ------- (op: SigmoidCrossEntropyLoss)
INFO net.py: 249: rpn_labels_int32            : (2, 12, 49, 82)      => loss_rpn_cls                : ()                   ------|
INFO net.py: 249: rpn_bbox_pred               : (2, 48, 49, 82)      => loss_rpn_bbox               : ()                   ------- (op: SmoothL1Loss)
INFO net.py: 249: rpn_bbox_targets            : (2, 48, 49, 82)      => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: rpn_bbox_inside_weights     : (2, 48, 49, 82)      => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: rpn_bbox_outside_weights    : (2, 48, 49, 82)      => loss_rpn_bbox               : ()                   ------|
INFO net.py: 249: fire9-concat                : (2, 512, 49, 82)     => roi_feat                    : (322, 512, 7, 7)     ------- (op: RoIAlign)
INFO net.py: 249: rois                        : (322, 5)             => roi_feat                    : (322, 512, 7, 7)     ------|
INFO net.py: 249: roi_feat                    : (322, 512, 7, 7)     => fc6                         : (322, 1024)          ------- (op: FC)
INFO net.py: 249: fc6                         : (322, 1024)          => fc6                         : (322, 1024)          ------- (op: Relu)
INFO net.py: 249: fc6                         : (322, 1024)          => fc7                         : (322, 1024)          ------- (op: FC)
INFO net.py: 249: fc7                         : (322, 1024)          => fc7                         : (322, 1024)          ------- (op: Relu)
INFO net.py: 249: fc7                         : (322, 1024)          => cls_score                   : (322, 81)            ------- (op: FC)
INFO net.py: 249: fc7                         : (322, 1024)          => bbox_pred                   : (322, 324)           ------- (op: FC)
INFO net.py: 249: cls_score                   : (322, 81)            => cls_prob                    : (322, 81)            ------- (op: SoftmaxWithLoss)
INFO net.py: 249: labels_int32                : (322,)               => cls_prob                    : (322, 81)            ------|
INFO net.py: 249: bbox_pred                   : (322, 324)           => loss_bbox                   : ()                   ------- (op: SmoothL1Loss)
INFO net.py: 249: bbox_targets                : (322, 324)           => loss_bbox                   : ()                   ------|
INFO net.py: 249: bbox_inside_weights         : (322, 324)           => loss_bbox                   : ()                   ------|
INFO net.py: 249: bbox_outside_weights        : (322, 324)           => loss_bbox                   : ()                   ------|
INFO net.py: 249: cls_prob                    : (322, 81)            => accuracy_cls                : ()                   ------- (op: Accuracy)
INFO net.py: 249: labels_int32                : (322,)               => accuracy_cls                : ()                   ------|
INFO net.py: 249: fire9-concat                : (2, 512, 49, 82)     => _[mask]_roi_feat            : (18, 512, 7, 7)      ------- (op: RoIAlign)
INFO net.py: 249: mask_rois                   : (18, 5)              => _[mask]_roi_feat            : (18, 512, 7, 7)      ------|
INFO net.py: 249: _[mask]_roi_feat            : (18, 512, 7, 7)      => _[mask]_fcn1                : (18, 256, 7, 7)      ------- (op: Conv)
INFO net.py: 249: _[mask]_fcn1                : (18, 256, 7, 7)      => _[mask]_fcn1                : (18, 256, 7, 7)      ------- (op: Relu)
INFO net.py: 249: _[mask]_fcn1                : (18, 256, 7, 7)      => _[mask]_fcn2                : (18, 256, 7, 7)      ------- (op: Conv)
INFO net.py: 249: _[mask]_fcn2                : (18, 256, 7, 7)      => _[mask]_fcn2                : (18, 256, 7, 7)      ------- (op: Relu)
INFO net.py: 249: _[mask]_fcn2                : (18, 256, 7, 7)      => _[mask]_fcn3                : (18, 256, 7, 7)      ------- (op: Conv)
INFO net.py: 249: _[mask]_fcn3                : (18, 256, 7, 7)      => _[mask]_fcn3                : (18, 256, 7, 7)      ------- (op: Relu)
INFO net.py: 249: _[mask]_fcn3                : (18, 256, 7, 7)      => _[mask]_fcn4                : (18, 256, 7, 7)      ------- (op: Conv)
INFO net.py: 249: _[mask]_fcn4                : (18, 256, 7, 7)      => _[mask]_fcn4                : (18, 256, 7, 7)      ------- (op: Relu)
INFO net.py: 249: _[mask]_fcn4                : (18, 256, 7, 7)      => conv5_mask                  : (18, 256, 14, 14)    ------- (op: ConvTranspose)
INFO net.py: 249: conv5_mask                  : (18, 256, 14, 14)    => conv5_mask                  : (18, 256, 14, 14)    ------- (op: Relu)
INFO net.py: 249: conv5_mask                  : (18, 256, 14, 14)    => mask_fcn_logits             : (18, 81, 14, 14)     ------- (op: Conv)
INFO net.py: 249: mask_fcn_logits             : (18, 81, 14, 14)     => mask_fcn_logits_up          : (18, 81, 28, 28)     ------- (op: ConvTranspose)
INFO net.py: 249: mask_fcn_logits_up          : (18, 81, 28, 28)     => loss_mask                   : ()                   ------- (op: SigmoidCrossEntropyLoss)
INFO net.py: 249: masks_int32                 : (18, 63504)          => loss_mask                   : ()                   ------|
INFO net.py: 253: End of model: generalized_rcnn

Also, after training, all the APs (0.5, 0.75, etc.) are zero, so something must still be off. By the way, this is the config I am using (printed at the beginning of training). Do you recognize anything that seems off?

 {'BBOX_XFORM_CLIP': 4.135166556742356,
 'CLUSTER': {'ON_CLUSTER': False},
 'DATA_LOADER': {'BLOBS_QUEUE_CAPACITY': 8,
                 'MINIBATCH_QUEUE_SIZE': 64,
                 'NUM_THREADS': 4},
 'DEDUP_BOXES': 0.0625,
 'DOWNLOAD_CACHE': '/tmp/detectron-download-cache',
 'EPS': 1e-14,
 'EXPECTED_RESULTS': [],
 'EXPECTED_RESULTS_ATOL': 0.005,
 'EXPECTED_RESULTS_EMAIL': '',
 'EXPECTED_RESULTS_RTOL': 0.1,
 'FAST_RCNN': {'CONV_HEAD_DIM': 256,
               'MLP_HEAD_DIM': 1024,
               'NUM_STACKED_CONVS': 4,
               'ROI_BOX_HEAD': 'fast_rcnn_heads.add_roi_2mlp_head',
               'ROI_XFORM_METHOD': 'RoIAlign',
               'ROI_XFORM_RESOLUTION': 7,
               'ROI_XFORM_SAMPLING_RATIO': 2},
 'FPN': {'COARSEST_STRIDE': 32,
         'DIM': 256,
         'EXTRA_CONV_LEVELS': False,
         'FPN_ON': False,
         'MULTILEVEL_ROIS': False,
         'MULTILEVEL_RPN': False,
         'ROI_CANONICAL_LEVEL': 4,
         'ROI_CANONICAL_SCALE': 224,
         'ROI_MAX_LEVEL': 5,
         'ROI_MIN_LEVEL': 2,
         'RPN_ANCHOR_START_SIZE': 32,
         'RPN_ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_MAX_LEVEL': 6,
         'RPN_MIN_LEVEL': 2,
         'USE_GN': False,
         'ZERO_INIT_LATERAL': False},
 'GROUP_NORM': {'DIM_PER_GP': -1, 'EPSILON': 1e-05, 'NUM_GROUPS': 32},
 'KRCNN': {'CONV_HEAD_DIM': 256,
           'CONV_HEAD_KERNEL': 3,
           'CONV_INIT': 'GaussianFill',
           'DECONV_DIM': 256,
           'DECONV_KERNEL': 4,
           'DILATION': 1,
           'HEATMAP_SIZE': -1,
           'INFERENCE_MIN_SIZE': 0,
           'KEYPOINT_CONFIDENCE': 'bbox',
           'LOSS_WEIGHT': 1.0,
           'MIN_KEYPOINT_COUNT_FOR_VALID_MINIBATCH': 20,
           'NMS_OKS': False,
           'NORMALIZE_BY_VISIBLE_KEYPOINTS': True,
           'NUM_KEYPOINTS': -1,
           'NUM_STACKED_CONVS': 8,
           'ROI_KEYPOINTS_HEAD': '',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 7,
           'ROI_XFORM_SAMPLING_RATIO': 0,
           'UP_SCALE': -1,
           'USE_DECONV': False,
           'USE_DECONV_OUTPUT': False},
 'MATLAB': 'matlab',
 'MEMONGER': True,
 'MEMONGER_SHARE_ACTIVATIONS': False,
 'MODEL': {'BBOX_REG_WEIGHTS': (10.0, 10.0, 5.0, 5.0),
           'CLS_AGNOSTIC_BBOX_REG': False,
           'CONV_BODY': 'squeezenet_adapted.add_squeezenet_conv5_body',
           'EXECUTION_TYPE': 'dag',
           'FASTER_RCNN': True,
           'KEYPOINTS_ON': False,
           'MASK_ON': True,
           'NUM_CLASSES': 81,
           'RPN_ONLY': False,
           'TYPE': 'generalized_rcnn'},
 'MRCNN': {'CLS_SPECIFIC_MASK': True,
           'CONV_INIT': 'MSRAFill',
           'DILATION': 1,
           'DIM_REDUCED': 256,
           'RESOLUTION': 28,
           'ROI_MASK_HEAD': 'mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 7,
           'ROI_XFORM_SAMPLING_RATIO': 2,
           'THRESH_BINARIZE': 0.5,
           'UPSAMPLE_RATIO': 2,
           'USE_FC_OUTPUT': False,
           'WEIGHT_LOSS_MASK': 1.0},
 'NUM_GPUS': 1,
 'OUTPUT_DIR': '/home/thiemi/Detectron/detectron/train/squeezenet/config5_droplayer',
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'RESNETS': {'NUM_GROUPS': 1,
             'RES5_DILATION': 1,
             'SHORTCUT_FUNC': 'basic_bn_shortcut',
             'STEM_FUNC': 'basic_bn_stem',
             'STRIDE_1X1': True,
             'TRANS_FUNC': 'bottleneck_transformation',
             'WIDTH_PER_GROUP': 64},
 'RETINANET': {'ANCHOR_SCALE': 4,
               'ASPECT_RATIOS': (0.5, 1.0, 2.0),
               'BBOX_REG_BETA': 0.11,
               'BBOX_REG_WEIGHT': 1.0,
               'CLASS_SPECIFIC_BBOX': False,
               'INFERENCE_TH': 0.05,
               'LOSS_ALPHA': 0.25,
               'LOSS_GAMMA': 2.0,
               'NEGATIVE_OVERLAP': 0.4,
               'NUM_CONVS': 4,
               'POSITIVE_OVERLAP': 0.5,
               'PRE_NMS_TOP_N': 1000,
               'PRIOR_PROB': 0.01,
               'RETINANET_ON': False,
               'SCALES_PER_OCTAVE': 3,
               'SHARE_CLS_BBOX_TOWER': False,
               'SOFTMAX': False},
 'RFCN': {'PS_GRID_SIZE': 3},
 'RNG_SEED': 3,
 'ROOT_DIR': '/home/thiemi/Detectron/detectron',
 'RPN': {'ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_ON': True,
         'SIZES': (64, 128, 256, 512),
         'STRIDE': 16},
 'SOLVER': {'BASE_LR': 0.02,
            'GAMMA': 0.1,
            'LOG_LR_CHANGE_THRESHOLD': 1.1,
            'LRS': [],
            'LR_POLICY': 'steps_with_decay',
            'MAX_ITER': 90000,
            'MOMENTUM': 0.9,
            'SCALE_MOMENTUM': True,
            'SCALE_MOMENTUM_THRESHOLD': 1.1,
            'STEPS': [0, 60000, 80000],
            'STEP_SIZE': 30000,
            'WARM_UP_FACTOR': 0.3333333333333333,
            'WARM_UP_ITERS': 500,
            'WARM_UP_METHOD': u'linear',
            'WEIGHT_DECAY': 0.0001,
            'WEIGHT_DECAY_GN': 0.0},
 'TEST': {'BBOX_AUG': {'AREA_TH_HI': 32400,
                       'AREA_TH_LO': 2500,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'COORD_HEUR': 'UNION',
                       'ENABLED': False,
                       'H_FLIP': False,
                       'MAX_SIZE': 4000,
                       'SCALES': (),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False,
                       'SCORE_HEUR': 'UNION'},
          'BBOX_REG': True,
          'BBOX_VOTE': {'ENABLED': False,
                        'SCORING_METHOD': 'ID',
                        'SCORING_METHOD_BETA': 1.0,
                        'VOTE_TH': 0.8},
          'COMPETITION_MODE': True,
          'DATASETS': ('coco_2014_minival',),
          'DETECTIONS_PER_IM': 100,
          'FORCE_JSON_DATASET_EVAL': False,
          'KPS_AUG': {'AREA_TH': 32400,
                      'ASPECT_RATIOS': (),
                      'ASPECT_RATIO_H_FLIP': False,
                      'ENABLED': False,
                      'HEUR': 'HM_AVG',
                      'H_FLIP': False,
                      'MAX_SIZE': 4000,
                      'SCALES': (),
                      'SCALE_H_FLIP': False,
                      'SCALE_SIZE_DEP': False},
          'MASK_AUG': {'AREA_TH': 32400,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'ENABLED': False,
                       'HEUR': 'SOFT_AVG',
                       'H_FLIP': False,
                       'MAX_SIZE': 4000,
                       'SCALES': (),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False},
          'MAX_SIZE': 1333,
          'NMS': 0.5,
          'PRECOMPUTED_PROPOSALS': False,
          'PROPOSAL_FILES': (),
          'PROPOSAL_LIMIT': 2000,
          'RPN_MIN_SIZE': 0,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 1000,
          'RPN_PRE_NMS_TOP_N': 1000,
          'SCALE': 800,
          'SCORE_THRESH': 0.05,
          'SOFT_NMS': {'ENABLED': False, 'METHOD': 'linear', 'SIGMA': 0.5},
          'WEIGHTS': ''},
 'TRAIN': {'ASPECT_GROUPING': True,
           'AUTO_RESUME': True,
           'BATCH_SIZE_PER_IM': 512,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'CROWD_FILTER_THRESH': 0.7,
           'DATASETS': ('coco_2014_train',),
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FREEZE_AT': 2,
           'FREEZE_CONV_BODY': False,
           'GT_MIN_AREA': -1,
           'IMS_PER_BATCH': 2,
           'MAX_SIZE': 1333,
           'PROPOSAL_FILES': (),
           'RPN_BATCH_SIZE_PER_IM': 256,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 0,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 2000,
           'RPN_STRADDLE_THRESH': 0,
           'SCALES': (800,),
           'SNAPSHOT_ITERS': 20000,
           'USE_FLIPPED': True,
           'WEIGHTS': '/home/thiemi/Detectron/detectron/models/squeezenet/squeezenet_model.pkl'},
 'USE_NCCL': False,
 'VIS': False,
 'VIS_TH': 0.9}

This is my current implementation of SqueezeNet:

def add_squeezenet_conv5_body(model):

    #kernel size=3, num_output=64, pad=0, stride=2
    model.Conv('data', 'conv1', 3, 64, kernel=3, stride=2)
    model.Relu('conv1', 'relu_conv1')
    model.MaxPool('relu_conv1', 'pool1', stride=2, kernel=3)
    #kernel=1, num_output=16
    model.Conv('pool1', 'fire2-squeeze1x1_w', 64, 16, kernel=1, pad=0)
    model.Relu('fire2-squeeze1x1_w', 'fire2-sqeeze1x1_b')

    #divide
    model.Conv('fire2-sqeeze1x1_b', 'fire2-expand1x1_w', 16, 64, kernel=1, stride=1)
    model.Relu('fire2-expand1x1_w', 'fire2-expand1x1_b')
    model.Conv('fire2-sqeeze1x1_b', 'fire2-expand3x3_w', 16, 64, kernel=3, pad=1 )
    model.Relu('fire2-expand3x3_w', 'fire2-expand3x3_b')
    model.Concat(['fire2-expand1x1_b', 'fire2-expand3x3_b'], 'fire2-concat')

    model.Conv('fire2-concat', 'fire3-squeeze1x1_w', 128, 16, kernel=1)
    model.Relu('fire3-squeeze1x1_w', 'fire3-squeeze1x1_b')

    #divide
    model.Conv('fire3-squeeze1x1_b', 'fire3-expand1x1_w', 16, 64, kernel=1)
    model.Relu('fire3-expand1x1_w', 'fire3-expand1x1_b')

    model.Conv('fire3-squeeze1x1_b', 'fire3-expand3x3_w', 16, 64, kernel=3, pad=1)
    model.Relu('fire3-expand3x3_w', 'fire3-expand3x3_b')
    model.Concat(['fire3-expand1x1_b' , 'fire3-expand3x3_b'], 'fire3-concat')

    model.MaxPool('fire3-concat', 'pool3', kernel=3, stride=2)
    model.Conv('pool3', 'fire4-squeeze1x1_w', 128, 32, kernel=1)
    model.Relu('fire4-squeeze1x1_w', 'fire4-squeeze1x1_b')

    #divide
    model.Conv('fire4-squeeze1x1_b', 'fire4-expand1x1_w', 32, 128, kernel=1)
    model.Relu('fire4-expand1x1_w', 'fire4-expand1x1_b')

    model.Conv('fire4-squeeze1x1_b', 'fire4-expand3x3_w', 32, 128, pad=1, kernel=3)
    model.Relu('fire4-expand3x3_w', 'fire4-expand3x3_b')
    model.Concat(['fire4-expand1x1_b', 'fire4-expand3x3_b'], 'fire4-concat')

    #fire5
    model.Conv('fire4-concat', 'fire5-squeeze1x1_w', 256, 32, kernel=1)
    model.Relu('fire5-squeeze1x1_w', 'fire5-squeeze1x1_b')

    #divide
    model.Conv('fire5-squeeze1x1_b', 'fire5-expand1x1_w', 32, 128, kernel=1)
    model.Relu('fire5-expand1x1_w', 'fire5-expand1x1_b')

    model.Conv('fire5-squeeze1x1_b', 'fire5-expand3x3_w', 32, 128, pad=1, kernel=3)
    model.Relu('fire5-expand3x3_w', 'fire5-expand3x3_b')
    model.Concat(['fire5-expand1x1_b', 'fire5-expand3x3_b'], 'fire5-concat')
    model.MaxPool('fire5-concat', 'pool5', kernel=3, stride=2)

    #fire6
    model.Conv('pool5', 'fire6-squeeze1x1_w', 256, 48, kernel=1)
    model.Relu('fire6-squeeze1x1_w', 'fire6-squeeze1x1_b')

    #divide
    model.Conv('fire6-squeeze1x1_b', 'fire6-expand1x1_w', 48, 192, kernel=1)
    model.Relu('fire6-expand1x1_w', 'fire6-expand1x1_b')

    model.Conv('fire6-squeeze1x1_b', 'fire6-expand3x3_w', 48, 192, pad=1, kernel=3)
    model.Relu('fire6-expand3x3_w', 'fire6-expand3x3_b')
    model.Concat(['fire6-expand1x1_b', 'fire6-expand3x3_b'], 'fire6-concat')

    #fire7
    model.Conv('fire6-concat', 'fire7-squeeze1x1_w', 384, 48, kernel=1)
    model.Relu('fire7-squeeze1x1_w', 'fire7-squeeze1x1_b')

    #divide
    model.Conv('fire7-squeeze1x1_b', 'fire7-expand1x1_w', 48, 192, kernel=1)
    model.Relu('fire7-expand1x1_w', 'fire7-expand1x1_b')

    model.Conv('fire7-squeeze1x1_b', 'fire7-expand3x3_w', 48, 192, pad=1, kernel=3)
    model.Relu('fire7-expand3x3_w', 'fire7-expand3x3_b')
    model.Concat(['fire7-expand1x1_b', 'fire7-expand3x3_b'], 'fire7-concat')

    #fire8
    model.Conv('fire7-concat', 'fire8-squeeze1x1_w', 384, 64, kernel=1)
    model.Relu('fire8-squeeze1x1_w', 'fire8-squeeze1x1_b')

    #divide
    model.Conv('fire8-squeeze1x1_b', 'fire8-expand1x1_w', 64, 256, kernel=1)
    model.Relu('fire8-expand1x1_w', 'fire8-expand1x1_b')

    model.Conv('fire8-squeeze1x1_b', 'fire8-expand3x3_w', 64, 256, pad=1, kernel=3)
    model.Relu('fire8-expand3x3_w', 'fire8-expand3x3_b')
    model.Concat(['fire8-expand1x1_b', 'fire8-expand3x3_b'], 'fire8-concat')

    #fire9
    model.Conv('fire8-concat', 'fire9-squeeze1x1_w', 512, 64, kernel=1)
    model.Relu('fire9-squeeze1x1_w', 'fire9-squeeze1x1_b')

    #divide
    model.Conv('fire9-squeeze1x1_b', 'fire9-expand1x1_w', 64, 256, kernel=1)
    model.Relu('fire9-expand1x1_w', 'fire9-expand1x1_b')

    model.Conv('fire9-squeeze1x1_b', 'fire9-expand3x3_w', 64, 256, pad=1, kernel=3)
    model.Relu('fire9-expand3x3_w', 'fire9-expand3x3_b')
    blob_out = model.Concat(['fire9-expand1x1_b', 'fire9-expand3x3_b'], 'fire9-concat')

    #model.Dropout('fire9-concat', 'drop9', is_test=0)
    #model.Conv('drop9', 'conv10', 512, 1000, kernel=1)
    #model.Relu('conv10', 'relu_conv10')
    #model.AveragePool('relu_conv10', 'pool10', global_pooling=True)
    #blob_out = model.Softmax('pool10', 'prob')
    return blob_out, 512, 1. / 16.  # (final blob, dim_out=512, spatial_scale=1/16)

I think my main problem is actually that the pretrained weights are not being found. Even after renaming the layers to the names demanded in the terminal output, it still cannot find them (and new names show up that were not even mentioned before).

liuliu66 commented 6 years ago

@johannathiemich Hi, I think I found some issues that might be related to your problem.

Firstly, you should not name your layers 'xxx_w' or 'xxx_b', because '_w' and '_b' are the suffixes generated during network initialization. For example, if you write a conv layer as model.Conv('data', 'x') with input 'data' and output 'x', Caffe2 will automatically create two parameter blobs called 'x_w' and 'x_b' when it initializes the network. So you do not need to append '_w' and '_b' yourself; if you do, you get messages like 'fire5-expand3x3_w_w not found' in the log, with a doubled '_w_w', as you can see.
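
For example (an untested sketch of the convention just described, not code posted in this thread), the fire2 block from the body above could be written without the suffixes, so that Caffe2 creates 'fire2-squeeze1x1_w' and 'fire2-squeeze1x1_b' itself:

# squeeze layer: output blob named without '_w'/'_b'
model.Conv('pool1', 'fire2-squeeze1x1', 64, 16, kernel=1, pad=0)
# in-place ReLU, as in the Inception-ResNet snippet below
model.Relu('fire2-squeeze1x1', 'fire2-squeeze1x1')
# the two expand branches both read from the squeeze output
model.Conv('fire2-squeeze1x1', 'fire2-expand1x1', 16, 64, kernel=1)
model.Relu('fire2-expand1x1', 'fire2-expand1x1')
model.Conv('fire2-squeeze1x1', 'fire2-expand3x3', 16, 64, kernel=3, pad=1)
model.Relu('fire2-expand3x3', 'fire2-expand3x3')
model.Concat(['fire2-expand1x1', 'fire2-expand3x3'], 'fire2-concat')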

Secondly, the layer names in the .pkl file contain the character '/', which cannot be used in Detectron, so you need to change it to '-' or '_' (whatever you like) and make sure the names match those in your network. For example, the log says 'fire2-squeeze1x1_w_w not found', but in the .pkl file the blob is named 'fire2/squeeze1x1_w'. You can use the cPickle package to change the keys and re-save the .pkl file.
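
For example, a rough re-keying sketch (the file names are assumptions, and the .pkl layout may be a flat dict or nested under a 'blobs' key, so adjust as needed):

# Re-key a Caffe2 .pkl weights file so '/' in the blob names becomes '-'
# (Python 2, as used by Detectron).
import cPickle as pickle

with open('squeezenet_original.pkl', 'rb') as f:
    src = pickle.load(f)

# handle either a flat {name: array} dict or one nested under 'blobs'
blobs = src['blobs'] if isinstance(src, dict) and 'blobs' in src else src
renamed = {name.replace('/', '-'): value for name, value in blobs.items()}

with open('pretrained_models/squeezenet.pkl', 'wb') as f:
    pickle.dump(renamed, f, pickle.HIGHEST_PROTOCOL)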

Here is part of my own Inception-ResNet network that you can use as a reference.

model.Conv('pool1_3x3_s2', 'conv4_3x3_reduce', 64, 80, 1, pad=0, stride=1, no_bias=1)
model.AffineChannel('conv4_3x3_reduce', 'conv4_3x3_reduce', dim=80)
model.Relu('conv4_3x3_reduce', 'conv4_3x3_reduce')
model.Conv('conv4_3x3_reduce', 'conv4_3x3', 80, 192, 3, pad=0, stride=1, no_bias=1)
model.AffineChannel('conv4_3x3', 'conv4_3x3', dim=192)
model.Relu('conv4_3x3', 'conv4_3x3')

By the way, if you are using the COCO dataset with 81 classes for your experiment, I think the config should be fine.

liuliu66 commented 6 years ago

@johannathiemich Hi, I tried your SqueezeNet model for Mask R-CNN on my dataset and got decent results. Due to limited time I only trained it for 5k iterations, reaching 50% AP on my object segmentation task, and the pretrained model could be fed. Here is part of my config and log for your reference.

MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: Squeezenet.add_Squeezenet_conv5_body
  NUM_CLASSES: 3
  FASTER_RCNN: True
  MASK_ON: True
NUM_GPUS: 1
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.001
  GAMMA: 0.1
  MAX_ITER: 5000
  STEPS: [0, 4000, 12000]
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
MRCNN:
  ROI_MASK_HEAD: mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs
  RESOLUTION: 28  # (output mask resolution) default 14
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 14  # default 7
  ROI_XFORM_SAMPLING_RATIO: 2  # default 0
  DILATION: 1  # default 2
  CONV_INIT: MSRAFill  # default GaussianFill
TRAIN:
  WEIGHTS: pretrained_models/squeezenet.pkl
  USE_FLIPPED: True
  DATASETS: ('coco_2014_trainval',)
  SCALES: (1000,)
  MAX_SIZE: 1200
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
TEST:
  DATASETS: ('coco_2014_test',)
  SCALE: 1000
  MAX_SIZE: 1200
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000
OUTPUT_DIR: . experiments/output/mask_rcnn/squeezenet

INFO roidb.py: 49: Appending horizontally-flipped training examples...
INFO roidb.py: 51: Loaded dataset: coco_chimeibing_2class_trainval
INFO roidb.py: 135: Filtered 0 roidb entries: 92 -> 92
INFO roidb.py: 67: Computing bounding-box regression targets...
INFO roidb.py: 69: done
INFO train.py: 188: 92 roidb entries
INFO net.py: 59: Loading weights from: pretrained_models/squeezenet.pkl
INFO net.py: 112: conv1_w was fed
INFO net.py: 112: conv1_b was fed
INFO net.py: 112: fire2-squeeze1x1_w was fed
INFO net.py: 112: fire2-squeeze1x1_b was fed
INFO net.py: 112: fire2-expand1x1_w was fed
INFO net.py: 112: fire2-expand1x1_b was fed
INFO net.py: 112: fire2-expand3x3_w was fed
INFO net.py: 112: fire2-expand3x3_b was fed
INFO net.py: 112: fire3-squeeze1x1_w was fed
INFO net.py: 112: fire3-squeeze1x1_b was fed
INFO net.py: 112: fire3-expand1x1_w was fed
INFO net.py: 112: fire3-expand1x1_b was fed
INFO net.py: 112: fire3-expand3x3_w was fed
INFO net.py: 112: fire3-expand3x3_b was fed
INFO net.py: 112: fire4-squeeze1x1_w was fed
INFO net.py: 112: fire4-squeeze1x1_b was fed
INFO net.py: 112: fire4-expand1x1_w was fed
INFO net.py: 112: fire4-expand1x1_b was fed
INFO net.py: 112: fire4-expand3x3_w was fed
INFO net.py: 112: fire4-expand3x3_b was fed
INFO net.py: 112: fire5-squeeze1x1_w was fed
INFO net.py: 112: fire5-squeeze1x1_b was fed
INFO net.py: 112: fire5-expand1x1_w was fed
INFO net.py: 112: fire5-expand1x1_b was fed
INFO net.py: 112: fire5-expand3x3_w was fed
INFO net.py: 112: fire5-expand3x3_b was fed
INFO net.py: 112: fire6-squeeze1x1_w was fed
INFO net.py: 112: fire6-squeeze1x1_b was fed
INFO net.py: 112: fire6-expand1x1_w was fed
INFO net.py: 112: fire6-expand1x1_b was fed
INFO net.py: 112: fire6-expand3x3_w was fed
INFO net.py: 112: fire6-expand3x3_b was fed
INFO net.py: 112: fire7-squeeze1x1_w was fed
INFO net.py: 112: fire7-squeeze1x1_b was fed
INFO net.py: 112: fire7-expand1x1_w was fed
INFO net.py: 112: fire7-expand1x1_b was fed
INFO net.py: 112: fire7-expand3x3_w was fed
INFO net.py: 112: fire7-expand3x3_b was fed
INFO net.py: 112: fire8-squeeze1x1_w was fed
INFO net.py: 112: fire8-squeeze1x1_b was fed
INFO net.py: 112: fire8-expand1x1_w was fed
INFO net.py: 112: fire8-expand1x1_b was fed
INFO net.py: 112: fire8-expand3x3_w was fed
INFO net.py: 112: fire8-expand3x3_b was fed
INFO net.py: 112: fire9-squeeze1x1_w was fed
INFO net.py: 112: fire9-squeeze1x1_b was fed
INFO net.py: 112: fire9-expand1x1_w was fed
INFO net.py: 112: fire9-expand1x1_b was fed
INFO net.py: 112: fire9-expand3x3_w was fed
INFO net.py: 112: fire9-expand3x3_b was fed

Note that the 'xxx was fed' messages are logging I added myself. If the blobs in your pretrained file are fed into the network, the log will not show any 'xxx not found' messages.
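
The extra logging is just something like the following, placed where net.py copies each pretrained blob into the workspace (the variable name here is hypothetical):

logger.info('{} was fed'.format(src_blob_name))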

johannathiemich commented 6 years ago

@liuliu66 Thank you again very much for your help! After some more testing, everything seems to work (more or less) as expected. Now I only have one more question: the inference times for different images vary greatly within a single infer_simple run (using the images provided in demo). This issue is mentioned here: https://github.com/facebookresearch/Detectron/blob/master/GETTING_STARTED.md, and it applies to my case since the misc_mask time is sometimes really high. But my problem/question is: I am using the pictures from the demo, so they are not the "high resolution images" mentioned there; they all have a resolution of about 600*900 pixels. So it does not make sense to resize them? But what is the solution then? This is a big problem for me since I have to achieve a stable and reliable frame rate in the end.

liuliu66 commented 6 years ago

@johannathiemich Hi, to be honest, I'm not sure about the solution to this problem, because my work does not have a strict requirement on detection speed, only on being as accurate as possible. I think the inference time for different demo images depends on the resolution of the original image and the number of objects it contains; the more objects an image contains, the more time it costs. Besides, resizing the images to a relatively low resolution should help, but it might lead to less accurate results. Sorry I may not be able to help you much with this problem.
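
If you do want to try lower-resolution inference, the test-time resize targets from the config dump above (TEST.SCALE and TEST.MAX_SIZE) could be reduced; the values here are only an illustration:

TEST:
  SCALE: 400     # target length of the shorter image side at inference
  MAX_SIZE: 667  # cap on the longer side after resizing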

johannathiemich commented 6 years ago

@liuliu66 Thank you for your answer. I actually think the issue had to do with the input images. In the end my application will get its input images from an attached camera, so those pictures will not differ in size at all. Interestingly, the inference times on those pictures seem to be stable, so I guess the issue is fixed, more or less. Again, thank you very much for all of your help!

johannathiemich commented 6 years ago

@liuliu66 Sorry that I need to bother you again. Do you have any other hints for integrating the FPN? When I just add an FPN layer onto my SqueezeNet body, I get the error that the input_blob does not have the attribute len(). It seems to expect a list, but I do not know how to generate one in the right format. Any help is greatly appreciated, as always.

liuliu66 commented 6 years ago

@johannathiemich Hi, sorry for the late response. When you want to add an FPN structure on top of your own conv body, such as SqueezeNet, you should define two functions in the detectron/modeling/FPN.py file. The format of the functions is similar to this:

def add_fpn_ResNet50_conv5_body(model):
    return add_fpn_onto_conv_body(
        model, ResNet.add_ResNet50_conv5_body, fpn_level_info_ResNet50_conv5
    )

def fpn_level_info_ResNet50_conv5():
    return FpnLevelInfo(
        blobs=('res5_2_sum', 'res4_5_sum', 'res3_3_sum', 'res2_2_sum'),
        dims=(2048, 1024, 512, 256),
        spatial_scales=(1. / 32., 1. / 16., 1. / 8., 1. / 4.)
    )

You need to change the function names and the argument names to match your SqueezeNet functions. For example, define functions called 'add_fpn_Squeezenet_conv5_body(model)' and 'fpn_level_info_Squeezenet_conv5'. Of course you can give them other names, but make sure they are called correctly. Then, in the first function, replace 'ResNet.add_ResNet50_conv5_body, fpn_level_info_ResNet50_conv5' with your own functions. In the second function, change the blobs, dims, and spatial_scales to those of your conv body, and make sure the dims and spatial_scales match the blobs. The blob names are up to you, but you should pick blobs with spatial_scales of (1. / 32., 1. / 16., 1. / 8., 1. / 4.) because those are the defaults; otherwise you need to change the code in detectron/roi_data/rpn.py.
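
As a rough, untested sketch for the SqueezeNet body posted above (the blob choices, dims, and spatial scales are only read off that body and are assumptions, not verified code; note they do not match the default 1/32 ... 1/4 scales, so rpn.py or the body's strides would need adjusting as mentioned):

def add_fpn_Squeezenet_conv5_body(model):
    # assumes the SqueezeNet body module is importable in FPN.py, e.g. as
    # squeezenet_adapted (the module name used in the config printed earlier)
    return add_fpn_onto_conv_body(
        model,
        squeezenet_adapted.add_squeezenet_conv5_body,
        fpn_level_info_Squeezenet_conv5
    )

def fpn_level_info_Squeezenet_conv5():
    return FpnLevelInfo(
        # coarsest to finest feature maps of the body above
        blobs=('fire9-concat', 'fire5-concat', 'fire3-concat', 'relu_conv1'),
        dims=(512, 256, 128, 64),
        # strides 16, 8, 4, 2 -- not the default 32, 16, 8, 4
        spatial_scales=(1. / 16., 1. / 8., 1. / 4., 1. / 2.)
    )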

Finally, write a YAML config file similar to the ones written by the authors, and then I think it should work fine. Sorry, I cannot paste my own config and code right now because I am away from my lab these days.

I hope this helps you.