facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
MIT License

Loss Weights for different parts #272

Open txytju opened 5 years ago

txytju commented 5 years ago

❓ Questions and Help

Do we support adding different loss weights to the RPN, mask head, and box head?

I have checked maskrcnn_benchmark/engine/trainer.py and found `losses = sum(loss for loss in loss_dict.values())`. So all the losses from the different parts use the same weight?

fmassa commented 5 years ago

Yes, all losses use the same multiplicative factor.

It is very easy to add custom weights for each one of the branches, but we haven't added it yet.

Let me know if you'd need some pointers on how to implement it.
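The most direct place to do it is the line where the losses are summed in maskrcnn_benchmark/engine/trainer.py. A rough, untested sketch (the key names below are the usual defaults and may differ depending on which heads your config enables):

```python
# Untested sketch: in maskrcnn_benchmark/engine/trainer.py, replace
#   losses = sum(loss for loss in loss_dict.values())
# with a weighted sum over the individual terms.
LOSS_WEIGHTS = {
    "loss_objectness": 1.0,    # RPN classification
    "loss_rpn_box_reg": 1.0,   # RPN box regression
    "loss_classifier": 1.0,    # box head classification
    "loss_box_reg": 1.0,       # box head regression
    "loss_mask": 1.0,          # mask head
}
losses = sum(LOSS_WEIGHTS.get(k, 1.0) * v for k, v in loss_dict.items())
```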

madurner commented 5 years ago

@fmassa I am currently fine-tuning a network based on the pre-trained models e2e_mask_rcnn_R_50_FPN_1x.pth and e2e_mask_rcnn_R_101_FPN_1x.pth from the [Model Zoo](https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/MODEL_ZOO.md). Since the evaluation on my test set is not very good, I am thinking about improvements. I played around with the learning rate (for a single GPU, from 0.0025 up to 0.01), but this did not lead to any improvement, so any other ideas are also welcome :P

Next, I looked at the predictions qualitatively and saw that there are a lot of cases where nothing is detected at all. So the classification as well as the segmentation are good IF a bounding box is detected. Now I am thinking of weighting the bounding box predictions more in the loss term. So:

  1. Do you think this could help?
  2. I assume the bbox detection would be affected by loss_rpn_box_reg, right? Or is it loss_box_reg?
  3. Do I only have to change this line, as stated above: https://github.com/facebookresearch/maskrcnn-benchmark/blob/b3d1de0088ad84b7a1cdee62c08418c7b9095acc/maskrcnn_benchmark/engine/trainer.py#L68 (roughly the change I sketch below)?
  4. What would be good scaling/weight values for the losses?
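For concreteness, this is roughly the change I have in mind for question 3 (untested; the 2.0 factors are just placeholders I would still need to tune):

```python
# In maskrcnn_benchmark/engine/trainer.py, instead of
#   losses = sum(loss for loss in loss_dict.values())
# something along the lines of:
loss_weights = {"loss_rpn_box_reg": 2.0, "loss_box_reg": 2.0}  # placeholder values
losses = sum(loss_weights.get(k, 1.0) * v for k, v in loss_dict.items())
```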

sanshibayuan commented 5 years ago

> (quoting @madurner's comment and questions above)

Hi, any luck with the weighted loss? I am trying to improve the mask results and I am not sure how to set the weights.

I assume that the bbox detection is related to both loss_rpn_box_reg and loss_box_reg, am I correct?

CarolineHaslebacher commented 2 years ago

Hi @fmassa, I would like to implement weights for my imbalanced training dataset. I calculated the weights per category (= class) based on their abundance.

So `weights = torch.tensor([0, 1/1035*100, 1/1258*100, 1/1014*100, 1/146*100, 1/6459*100])`.

This results in a weight tensor of [0.0000, 0.0966, 0.0795, 0.0986, 0.6849, 0.0155], which makes sense to me, since class 4 is the one with the fewest instances (highest weight) and class 5 the one with the most (lowest weight); category 0 is the 'empty' class and therefore gets no weight. Side question: does the magnitude of these numbers matter? I just multiplied by 100 to make them a bit bigger, but perhaps that is not necessary?
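In code, the calculation looks like this (the normalization at the end is just one option I am considering for the side question, not something I have validated):

```python
import torch

# Instance counts per class (index 0 is the 'empty' class).
counts = torch.tensor([1., 1035., 1258., 1014., 146., 6459.])

weights = 100.0 / counts   # inverse frequency, scaled by 100
weights[0] = 0.0           # the empty class gets no weight
# -> tensor([0.0000, 0.0966, 0.0795, 0.0986, 0.6849, 0.0155])

# One option for the scale question: rescale so the non-zero weights average to 1,
# e.g. weights = weights * (weights > 0).sum() / weights.sum()
```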

Now I want to add these class weights to the losses. I have a few questions:

  1. Is it a good idea to add these weights to all of the losses (mask_loss, box_loss, classification_loss, keypoint_loss)?
  2. How would I do this, for example, for the mask_loss? I added a keyword argument posweights=torch.tensor([0.0000, 0.0966, 0.0795, 0.0986, 0.6849, 0.0155]), which I thought I could pass as pos_weight to binary_cross_entropy_with_logits, but I get `The size of tensor a (6) must match the size of tensor b (28) at non-singleton dimension 2`. The whole mask_loss is:

     ```python
     mask_loss = F.binary_cross_entropy_with_logits(
         mask_logits[torch.arange(labels.shape[0], device=labels.device), labels],
         mask_targets,
         pos_weight=posweights,  # added by me
     )
     ```

     Perhaps this gets solved once question 1 is answered. I believe that one mask has a shape of (28, 28), so I would need to specify a weight for each mask pixel, which does not make sense. So perhaps I should only implement the per-class weights for box_loss, classification_loss and keypoint_loss? (I put a sketch of what I am currently considering after these questions.)
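For concreteness, here is the sketch mentioned in question 2. Variant (a) is how I would apply the class weights to the classification loss; variant (b) is only my guess at a per-ROI alternative for the mask loss, since pos_weight is broadcast against the trailing (28, 28) mask dimensions, which explains the size-mismatch error. Neither is tested.

```python
import torch
import torch.nn.functional as F

class_weights = torch.tensor([0.0, 0.0966, 0.0795, 0.0986, 0.6849, 0.0155])

# (a) Per-class weights fit naturally into the classification loss, e.g.
#     classification_loss = F.cross_entropy(class_logits, labels, weight=class_weights)

# (b) Guess at a per-ROI weighting of the mask loss: weight each ROI's
#     (28, 28) mask loss by the weight of that ROI's class label.
def weighted_mask_loss(mask_logits, mask_targets, labels, class_weights):
    idx = torch.arange(labels.shape[0], device=labels.device)
    per_roi = F.binary_cross_entropy_with_logits(
        mask_logits[idx, labels], mask_targets, reduction="none"
    ).mean(dim=(1, 2))                           # average over the mask pixels
    w = class_weights.to(labels.device)[labels]  # one weight per ROI
    return (w * per_roi).sum() / w.sum().clamp(min=1e-8)
```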