Open kevslinger opened 6 years ago
I have noticed similar results when using a higher learning rate during training. Which LR are you using? Could you paste your training and detection configs?
@JoeLogan1981 Thanks for the response! My LR is 0.001, rest of configs are...

Configurations:

    BACKBONE                       resnet101
    BACKBONE_STRIDES               [4, 8, 16, 32, 64]
    BATCH_SIZE                     2
    BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
    COMPUTE_BACKBONE_SHAPE         None
    DETECTION_MAX_INSTANCES        100
    DETECTION_MIN_CONFIDENCE       0.9
    DETECTION_NMS_THRESHOLD        0.3
    FPN_CLASSIF_FC_LAYERS_SIZE     1024
    GPU_COUNT                      1
    GRADIENT_CLIP_NORM             5.0
    IMAGES_PER_GPU                 2
    IMAGE_MAX_DIM                  1024
    IMAGE_META_SIZE                14
    IMAGE_MIN_DIM                  800
    IMAGE_MIN_SCALE                0
    IMAGE_RESIZE_MODE              square
    IMAGE_SHAPE                    [1024 1024 3]
    LEARNING_MOMENTUM              0.9
    LEARNING_RATE                  0.001
    LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
    MASK_POOL_SIZE                 14
    MASK_SHAPE                     [28, 28]
    MAX_GT_INSTANCES               100
    MEAN_PIXEL                     [123.7 116.8 103.9]
    MINI_MASK_SHAPE                (56, 56)
    NAME                           bubble
    NUM_CLASSES                    2
    POOL_SIZE                      7
    POST_NMS_ROIS_INFERENCE        1000
    POST_NMS_ROIS_TRAINING         2000
    ROI_POSITIVE_RATIO             0.33
    RPN_ANCHOR_RATIOS              [0.5, 1, 2]
    RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
    RPN_ANCHOR_STRIDE              1
    RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
    RPN_NMS_THRESHOLD              0.7
    RPN_TRAIN_ANCHORS_PER_IMAGE    256
    STEPS_PER_EPOCH                100
    TOP_DOWN_PYRAMID_SIZE          256
    TRAIN_BN                       False
    TRAIN_ROIS_PER_IMAGE           200
    USE_MINI_MASK                  True
    USE_RPN_ROIS                   True
    VALIDATION_STEPS               50
    WEIGHT_DECAY                   0.0001
Is there any more info I can provide that would be useful to you? Let me know. Thanks!
Hi Kevin, any chance you figured out the issue of dead neurons? I've encountered the same issue.
I still got NaN loss after removing empty ROIs/anchors. Lowering the learning rate fixed the problem.
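For anyone wanting to try the same fix, here is a minimal sketch of a config with a lower learning rate. It assumes the Matterport `mrcnn` package is installed. `NAME`, `NUM_CLASSES`, `GPU_COUNT` and `IMAGES_PER_GPU` are copied from the config posted in this thread; the class name `BubbleConfig` and the value `0.0001` are only illustrative assumptions (ten times lower than the 0.001 reported above), not values anyone in the thread confirmed.

```python
from mrcnn.config import Config


class BubbleConfig(Config):
    """Hypothetical training config; only LEARNING_RATE differs from the thread."""
    NAME = "bubble"
    NUM_CLASSES = 2          # background + 1 class, as in the posted config
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    LEARNING_RATE = 0.0001   # assumed value, 10x lower than the posted 0.001


config = BubbleConfig()
config.display()  # prints the full configuration so the change can be verified
```

Since the weights from a run that diverged may already be dead or NaN, it is probably safer to restart from the original pretrained weights than to resume from the broken checkpoint.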
Hi @amangupta2303, no, I was not able to solve this issue. For my project, I was able to use a Unet to solve my task.
@kevslinger thank you for the response. It started working: I fixed the error by decreasing the learning rate and restarting training, after which model.detect no longer returned empty arrays. I then tried to run predictions on new, unseen images with the splash command, but it produced no detections, polygons, or bounding boxes for the new image. Can anyone help me out here?
Hello!
Thank you to all the contributors of this project; I really appreciate having open-source tools as great as this one everyone can use.
After training, I ran through inspect_balloon_model.ipynb, and noticed my model never predicted anything.
More specifically, after running

    results = model.detect([image], verbose=1)

my results variable was

    {'rois': array([], shape=(0, 4), dtype=int32), 'class_ids': array([], dtype=int32), 'scores': array([], dtype=float32), 'masks': array([], shape=(1024, 1024, 0), dtype=float64)}
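A small helper can make the "nothing detected" case explicit when checking many images. The sketch below assumes a single result dict with the keys shown above (`rois`, `class_ids`, `scores`, `masks`); the function name `detection_summary` is made up for illustration and is not part of the library.

```python
import numpy as np


def detection_summary(result):
    """Summarise one detection dict: instance count, best score, shape sanity."""
    scores = np.asarray(result["scores"], dtype=np.float32)
    rois = np.asarray(result["rois"])
    masks = np.asarray(result["masks"])
    return {
        "num_instances": int(scores.shape[0]),
        # None when there are no detections at all
        "max_score": float(scores.max()) if scores.size else None,
        # rois, scores and masks should all agree on the number of instances
        "consistent": rois.shape[0] == scores.shape[0] == masks.shape[-1],
    }


# The empty result from this post, rebuilt by hand
empty = {
    "rois": np.zeros((0, 4), dtype=np.int32),
    "class_ids": np.zeros((0,), dtype=np.int32),
    "scores": np.zeros((0,), dtype=np.float32),
    "masks": np.zeros((1024, 1024, 0), dtype=np.float64),
}
print(detection_summary(empty))
```

If `model.detect` returns a list with one dict per image, as I believe it does in this repo, pass `results[0]` rather than `results`.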
So I decided to inspect my weights (the output is pasted below). I noticed that many of the weights are dead, and I have no idea how to even begin debugging that. I trained my model a second time to make sure it wasn't something random, and got the same dead weights. Does anyone know how to fix or debug this? What other information would be useful?
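One possible starting point for debugging is to flag suspicious weight arrays programmatically instead of reading the table by eye. The sketch below uses only NumPy. Treating a tensor as "suspect" when it contains NaN/inf values or has essentially zero spread is my own working definition, not something the library defines, and the names `weight_stats` and `zero_tol` are hypothetical.

```python
import numpy as np


def weight_stats(named_weights, zero_tol=1e-12):
    """Return per-array statistics and a 'suspect' flag for possibly dead weights.

    named_weights: iterable of (name, array) pairs.
    """
    rows = []
    for name, w in named_weights:
        w = np.asarray(w, dtype=np.float64)
        finite = np.isfinite(w)
        n_bad = int(w.size - finite.sum())   # count of NaN / inf entries
        vals = w[finite]
        frac_zero = float(np.mean(np.abs(vals) <= zero_tol)) if vals.size else 0.0
        std = float(vals.std()) if vals.size else float("nan")
        rows.append({
            "name": name,
            "shape": w.shape,
            "non_finite": n_bad,
            "frac_zero": frac_zero,
            "std": std,
            # Assumed heuristic: NaN/inf present, nothing finite, or no spread
            "suspect": n_bad > 0 or vals.size == 0 or std <= zero_tol,
        })
    return rows


# Synthetic demo arrays (not real model weights)
rng = np.random.default_rng(0)
demo = [
    ("conv1/kernel", rng.normal(size=(3, 3, 3, 8))),
    ("dead/kernel", np.zeros((3, 3, 8, 8))),
    ("nan/bias", np.array([np.nan, 0.1])),
]
for row in weight_stats(demo):
    print(row["name"], row["shape"], row["non_finite"], row["suspect"])
```

With a Keras model, the (name, array) pairs could presumably be built by pairing each layer's `layer.weights` with `layer.get_weights()`. Checking the weights right after loading the pretrained file and again after the first epoch would show whether training itself is what kills them.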
Thank you!