roytseng-tw / Detectron.pytorch

A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.
MIT License

loss_bbox almost equal to 0 #179

Open liuliu66 opened 5 years ago

liuliu66 commented 5 years ago

I am trying to train Faster R-CNN with FPN and a ResNet-50 backbone on my own dataset. However, loss_bbox stays almost at zero, and the final mAP is 0 or some other abnormally small number. When I train Faster R-CNN without FPN, it works well. Here are my YAML file and part of the logs. Could anybody help me solve this problem?

MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.fpn_ResNet50_conv5_body
  NUM_CLASSES: 2
  FASTER_RCNN: True
NUM_GPUS: 1
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.001
  GAMMA: 0.1
  MAX_ITER: 10000
  STEPS: [0, 4000, 8000]
FPN:
  FPN_ON: True
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.roi_2mlp_head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
RESNETS:
  IMAGENET_PRETRAINED_WEIGHTS: pretrained_models/R-50.pkl
TRAIN:
  DATASETS: ('coco_daofeishi_trainval',)
  SCALES: (1200,)
  MAX_SIZE: 1500
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
  SNAPSHOT_ITERS: 30000
  IMS_PER_BATCH: 1
  USE_FLIPPED: True
TEST:
  DATASETS: ('coco_daofeishi_test',)
  SCALE: 1200
  MAX_SIZE: 1500
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000

[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2721 / 5000]
    loss: 0.499181, lr: 0.001000 time: 0.911114, eta: 0:34:37
    accuracy_cls: 0.966797
    loss_cls: 0.147242, loss_bbox: 0.000006
    loss_rpn_bbox: 0.004001, loss_rpn_cls: 0.323238
    loss_rpn_cls_fpn3: 0.087049, loss_rpn_cls_fpn2: 0.149994, loss_rpn_cls_fpn5: 0.000925, loss_rpn_cls_fpn4: 0.006139, loss_rpn_cls_fpn6: 0.000000
    loss_rpn_bbox_fpn3: 0.001847, loss_rpn_bbox_fpn2: 0.000750, loss_rpn_bbox_fpn6: 0.000000, loss_rpn_bbox_fpn5: 0.000000, loss_rpn_bbox_fpn4: 0.000000
[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2741 / 5000]
    loss: 0.352934, lr: 0.001000 time: 0.911065, eta: 0:34:19
    accuracy_cls: 0.980469
    loss_cls: 0.105179, loss_bbox: 0.000005
    loss_rpn_bbox: 0.002404, loss_rpn_cls: 0.241386
    loss_rpn_cls_fpn3: 0.073445, loss_rpn_cls_fpn2: 0.132293, loss_rpn_cls_fpn5: 0.001676, loss_rpn_cls_fpn4: 0.007521, loss_rpn_cls_fpn6: 0.000000
    loss_rpn_bbox_fpn3: 0.001854, loss_rpn_bbox_fpn2: 0.000000, loss_rpn_bbox_fpn6: 0.000000, loss_rpn_bbox_fpn5: 0.000000, loss_rpn_bbox_fpn4: 0.000000
[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2761 / 5000]
    loss: 0.471432, lr: 0.001000 time: 0.911027, eta: 0:34:00
    accuracy_cls: 0.970703
    loss_cls: 0.135180, loss_bbox: 0.000005
    loss_rpn_bbox: 0.004191, loss_rpn_cls: 0.327423
    loss_rpn_cls_fpn3: 0.065737, loss_rpn_cls_fpn2: 0.206389, loss_rpn_cls_fpn5: 0.001667, loss_rpn_cls_fpn4: 0.005925, loss_rpn_cls_fpn6: 0.000000
    loss_rpn_bbox_fpn3: 0.001326, loss_rpn_bbox_fpn2: 0.002395, loss_rpn_bbox_fpn6: 0.000000, loss_rpn_bbox_fpn5: 0.000000, loss_rpn_bbox_fpn4: 0.000000
[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2781 / 5000]
    loss: 0.372895, lr: 0.001000 time: 0.910958, eta: 0:33:42
    accuracy_cls: 0.978516
    loss_cls: 0.110998, loss_bbox: 0.000003
    loss_rpn_bbox: 0.002550, loss_rpn_cls: 0.259391
    loss_rpn_cls_fpn3: 0.066516, loss_rpn_cls_fpn2: 0.185310, loss_rpn_cls_fpn5: 0.001093, loss_rpn_cls_fpn4: 0.006211, loss_rpn_cls_fpn6: 0.000000
    loss_rpn_bbox_fpn3: 0.000808, loss_rpn_bbox_fpn2: 0.001618, loss_rpn_bbox_fpn6: 0.000000, loss_rpn_bbox_fpn5: 0.000000, loss_rpn_bbox_fpn4: 0.000000
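
For anyone who wants to check whether loss_bbox is near zero over the whole run rather than only in this excerpt, here is a minimal sketch that pulls the (step, loss_bbox) pairs out of log text shaped like the lines above. The function name `extract_loss_bbox` and the regular expressions are illustrative assumptions based only on the log format shown in this post; they are not part of Detectron.pytorch.

```python
import re

# Patterns inferred from the log excerpt in this issue (an assumption,
# not an official log format specification).
STEP_RE = re.compile(r"\[Step\s+(\d+)\s*/\s*(\d+)\]")
BBOX_RE = re.compile(r"\bloss_bbox:\s*([0-9]*\.?[0-9]+)")


def extract_loss_bbox(log_text):
    """Return a list of (step, loss_bbox) pairs found in log_text.

    Each entry is assumed to start with a "[Step N / M]" marker; the first
    "loss_bbox:" value after that marker is attributed to step N.
    """
    results = []
    markers = list(STEP_RE.finditer(log_text))
    for i, marker in enumerate(markers):
        # The entry body runs until the next step marker (or end of text).
        end = markers[i + 1].start() if i + 1 < len(markers) else len(log_text)
        body = log_text[marker.end():end]
        found = BBOX_RE.search(body)
        if found:
            results.append((int(marker.group(1)), float(found.group(1))))
    return results


# Two abbreviated entries copied from the log above.
sample = (
    "[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2721 / 5000] "
    "loss: 0.499181, lr: 0.001000 loss_cls: 0.147242, loss_bbox: 0.000006 "
    "loss_rpn_bbox: 0.004001, loss_rpn_cls: 0.323238 "
    "[Nov12-16-53-37_iim321_step][e2e_faster_rcnn_resnet-50-FPN.yaml][Step 2741 / 5000] "
    "loss: 0.352934, lr: 0.001000 loss_cls: 0.105179, loss_bbox: 0.000005 "
    "loss_rpn_bbox: 0.002404, loss_rpn_cls: 0.241386"
)

if __name__ == "__main__":
    for step, value in extract_loss_bbox(sample):
        print(step, value)
```

The pattern matches `loss_bbox:` only, so `loss_rpn_bbox` and the per-level `loss_rpn_bbox_fpn*` values are not picked up. The same approach should work on a whole log file read into a string.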

liuliu66 commented 5 years ago

@roytseng-tw could you please help me see what the problem is here? Thank you.

ligang-cs commented 5 years ago

@liuliu66 I have encountered the same problem. Have you solved it? Thank you!

frezaeix commented 5 years ago

@liuliu66, @LIszu I have the same problem. Did you solve it?