pascal_voc_detection_metrics gives very low scores for first category in label_map

MariaKalt commented 3 years ago

I am following this tutorial to train a Faster R-CNN model for object detection on my own data.

When evaluating with python model_main_tf2.py --model_dir=models/MY_MODEL --pipeline_config_path=models/MY_MODEL/pipeline.config --checkpoint_dir=models/MY_MODEL and the Pascal VOC evaluator, the first category in my label map is never evaluated correctly.

I have trained the model with different numbers of classes, and every time the first category has a very low AP. The code I'm posting here refers to a model trained to detect two classes. The record files were created with the script from the tutorial.

This is my config file:

model {
  faster_rcnn {
    num_classes: 2
    image_resizer {
      fixed_shape_resizer {
        width: 1024
        height: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101_keras'
      batch_norm_trainable: true
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
        share_box_across_classes: true
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
    use_static_shapes: true
    use_matmul_crop_and_resize: true
    clip_anchors_to_image: true
    use_static_balanced_label_sampler: true
    use_matmul_gather_in_matcher: true
  }
}
train_config {
  batch_size: 4
  num_steps: 6000
  sync_replicas: true
  optimizer {
    momentum_optimizer {
      learning_rate {
        cosine_decay_learning_rate {
          learning_rate_base: 0.002
          total_steps: 6000
          warmup_learning_rate: 0.0002
          warmup_steps: 300
          hold_base_rate_steps: 0
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "pre-trained-models/faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8/checkpoint/ckpt-0"
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  use_bfloat16: false
  fine_tune_checkpoint_version: V2
}
train_input_reader: {
  label_map_path: "annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "annotations/train.record"
  }
}
eval_config: {
  metrics_set: "pascal_voc_detection_metrics"
  use_moving_averages: false
}
eval_input_reader: {
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.record"
  }
}

My label_map:
item {
    id: 1
    name: 'Signal'
}
item {
    id: 2
    name: 'Schild' # this means sign in german
}
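
For anyone who wants to rule out the label map itself, here is a small sketch that reads the item ids and names with regular expressions and checks that the ids start at 1 and are contiguous (id 0 being the background class). The parser and both function names are hypothetical helpers written for this post, not the API's own label map utilities, and they only handle the simple item { id ... name ... } layout shown above.

```python
import re

# Hypothetical minimal parser: only handles simple `item { id: N name: '...' }` blocks.
ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
ID_RE = re.compile(r"\bid\s*:\s*(\d+)")
NAME_RE = re.compile(r"\bname\s*:\s*['\"]([^'\"]+)['\"]")


def parse_label_map(text):
    """Return a list of (id, name) tuples in file order."""
    items = []
    for body in ITEM_RE.findall(text):
        id_match = ID_RE.search(body)
        name_match = NAME_RE.search(body)
        if id_match and name_match:
            items.append((int(id_match.group(1)), name_match.group(1)))
    return items


def check_ids(items):
    """Return a list of human-readable problems; an empty list means the ids look fine."""
    problems = []
    ids = [item_id for item_id, _ in items]
    if 0 in ids:
        problems.append("id 0 is used (reserved for background)")
    if len(set(ids)) != len(ids):
        problems.append("duplicate ids")
    if sorted(ids) != list(range(1, len(ids) + 1)):
        problems.append("ids are not contiguous starting at 1")
    return problems


LABEL_MAP = """
item {
    id: 1
    name: 'Signal'
}
item {
    id: 2
    name: 'Schild' # this means sign in german
}
"""

items = parse_label_map(LABEL_MAP)
print(items, check_ids(items))
```

An empty problem list here only says the label map is well formed; it does not rule out a mismatch between the label map and the ids stored in the record files.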

The results from evaluation are

I0419 13:21:50.826318 140148733331264 object_detection_evaluation.py:1335] average_precision: 0.004911
I0419 13:21:50.853986 140148733331264 object_detection_evaluation.py:1335] average_precision: 0.802144
INFO:tensorflow:Eval metrics at step 6000
I0419 13:21:50.858681 140148733331264 model_lib_v2.py:988] Eval metrics at step 6000
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.403527
I0419 13:21:50.907640 140148733331264 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.403527
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.004911
I0419 13:21:50.909214 140148733331264 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.004911
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.802144
I0419 13:21:50.910397 140148733331264 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.802144
INFO:tensorflow:        + Loss/RPNLoss/localization_loss: 0.009531
I0419 13:21:50.911445 140148733331264 model_lib_v2.py:991]      + Loss/RPNLoss/localization_loss: 0.009531
INFO:tensorflow:        + Loss/RPNLoss/objectness_loss: 0.003197
I0419 13:21:50.912504 140148733331264 model_lib_v2.py:991]      + Loss/RPNLoss/objectness_loss: 0.003197
INFO:tensorflow:        + Loss/BoxClassifierLoss/localization_loss: 0.022416
I0419 13:21:50.913525 140148733331264 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/localization_loss: 0.022416
INFO:tensorflow:        + Loss/BoxClassifierLoss/classification_loss: 0.027070
I0419 13:21:50.914626 140148733331264 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/classification_loss: 0.027070
INFO:tensorflow:        + Loss/regularization_loss: 0.000000
I0419 13:21:50.915687 140148733331264 model_lib_v2.py:991]      + Loss/regularization_loss: 0.000000
INFO:tensorflow:        + Loss/total_loss: 0.062212
I0419 13:21:50.916763 140148733331264 model_lib_v2.py:991]      + Loss/total_loss: 0.062212

Experiments:

- When using the categories below, Mast is the one with a very low AP@0.5 (0.025):
```
item {
    id: 1
    name: 'Mast'
}
item {
    id: 2
    name: 'Signal'
}
item {
    id: 3
    name: 'Schild'
}
```

- Using a dummy category (which has no objects in the data) as the first category in the label map, all other categories are evaluated in a meaningful way (the overall mAP is of course affected by the zero AP of the dummy class):

item { id: 1 name: 'Dummy' }

item { id: 2 name: 'Mast' }

item { id: 3 name: 'Signal' }

item { id: 4 name: 'Schild' }

Evaluation results:

I0420 10:52:29.358691 140207870719808 checkpoint_utils.py:134] Found new checkpoint at models/my_faster_rcnn_resnet101_1024_test1/ckpt-5
INFO:tensorflow:Finished eval step 0
I0420 10:52:42.201465 140207870719808 model_lib_v2.py:939] Finished eval step 0
I0420 10:53:19.393145 140207870719808 object_detection_evaluation.py:1335] average_precision: 0.000000
I0420 10:53:19.398215 140207870719808 object_detection_evaluation.py:1335] average_precision: 0.198166
I0420 10:53:19.401919 140207870719808 object_detection_evaluation.py:1335] average_precision: 0.621884
I0420 10:53:19.405840 140207870719808 object_detection_evaluation.py:1335] average_precision: 0.692288
INFO:tensorflow:Eval metrics at step 1500
I0420 10:53:19.407017 140207870719808 model_lib_v2.py:988] Eval metrics at step 1500
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.378084
I0420 10:53:19.435913 140207870719808 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.378084
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Dummy: 0.000000
I0420 10:53:19.436885 140207870719808 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Dummy: 0.000000
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.198166
I0420 10:53:19.437575 140207870719808 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.198166
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.621884
I0420 10:53:19.438186 140207870719808 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.621884
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.692288
I0420 10:53:19.438765 140207870719808 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.692288
INFO:tensorflow:        + Loss/RPNLoss/localization_loss: 0.038080
I0420 10:53:19.439270 140207870719808 model_lib_v2.py:991]      + Loss/RPNLoss/localization_loss: 0.038080
INFO:tensorflow:        + Loss/RPNLoss/objectness_loss: 0.013058
I0420 10:53:19.439779 140207870719808 model_lib_v2.py:991]      + Loss/RPNLoss/objectness_loss: 0.013058
INFO:tensorflow:        + Loss/BoxClassifierLoss/localization_loss: 0.036514
I0420 10:53:19.440281 140207870719808 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/localization_loss: 0.036514
INFO:tensorflow:        + Loss/BoxClassifierLoss/classification_loss: 0.046075
I0420 10:53:19.440786 140207870719808 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/classification_loss: 0.046075
INFO:tensorflow:        + Loss/regularization_loss: 0.000000
I0420 10:53:19.441344 140207870719808 model_lib_v2.py:991]      + Loss/regularization_loss: 0.000000
INFO:tensorflow:        + Loss/total_loss: 0.133727
I0420 10:53:19.441860 140207870719808 model_lib_v2.py:991]      + Loss/total_loss: 0.133727


- ALL categories can be found using inference on test images
- Evaluating with COCO metrics gives me an mAP@0.50IOU of 0.773897

By now I think this might be a bug, but I'm hoping it is a problem in my code/config/data that I can solve.

Can somebody help me to solve this issue? 
Does it have to do with category ID 1 interfering with background category (ID 0)?

e10101 commented 3 years ago

I have the same issue here. Please fix this problem.

e10101 commented 3 years ago

Hi @MariaKalt , I created a pull-request to try to fix this issue: https://github.com/tensorflow/models/pull/9956

Please try the changes from that pull request and let me know whether they fix the issue you found.

MariaKalt commented 3 years ago

Thank you for your reply! Unfortunately the changes don't seem to fix the problem in my case. What I have done:

I have trained a model with the original object_detection_evaluation.py file. The evaluation results after 1000 steps were:

I0429 08:04:35.500414 140300519327552 checkpoint_utils.py:134] Found new checkpoint at models/my_faster_rcnn_resnet101_1024_v9_test/ckpt-12
INFO:tensorflow:Finished eval step 0
I0429 08:04:57.713613 140300519327552 model_lib_v2.py:939] Finished eval step 0
I0429 08:05:34.069654 140300519327552 object_detection_evaluation.py:1335] average_precision: 0.003360
I0429 08:05:34.100008 140300519327552 object_detection_evaluation.py:1335] average_precision: 0.983333
I0429 08:05:34.130861 140300519327552 object_detection_evaluation.py:1335] average_precision: 1.000000
INFO:tensorflow:Eval metrics at step 1000
I0429 08:05:34.133393 140300519327552 model_lib_v2.py:988] Eval metrics at step 1000
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.662231
I0429 08:05:34.146322 140300519327552 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.662231
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003360
I0429 08:05:34.146905 140300519327552 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003360
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.983333
I0429 08:05:34.147211 140300519327552 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.983333
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
I0429 08:05:34.147500 140300519327552 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
INFO:tensorflow:        + Loss/RPNLoss/localization_loss: 0.018888
I0429 08:05:34.147752 140300519327552 model_lib_v2.py:991]      + Loss/RPNLoss/localization_loss: 0.018888
INFO:tensorflow:        + Loss/RPNLoss/objectness_loss: 0.009581
I0429 08:05:34.148016 140300519327552 model_lib_v2.py:991]      + Loss/RPNLoss/objectness_loss: 0.009581
INFO:tensorflow:        + Loss/BoxClassifierLoss/localization_loss: 0.028051
I0429 08:05:34.148275 140300519327552 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/localization_loss: 0.028051
INFO:tensorflow:        + Loss/BoxClassifierLoss/classification_loss: 0.041030
I0429 08:05:34.148535 140300519327552 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/classification_loss: 0.041030
INFO:tensorflow:        + Loss/regularization_loss: 0.000000
I0429 08:05:34.148796 140300519327552 model_lib_v2.py:991]      + Loss/regularization_loss: 0.000000
INFO:tensorflow:        + Loss/total_loss: 0.097550
I0429 08:05:34.149062 140300519327552 model_lib_v2.py:991]      + Loss/total_loss: 0.097550

After that I have added the changes to the object_detection_evaluation.py file and trained the same model. The evaluation results, again at Step 1000, are:

INFO:tensorflow:Found new checkpoint at models/my_faster_rcnn_resnet101_1024_v9_test2/ckpt-11
I0429 08:47:02.798970 139898413045568 checkpoint_utils.py:134] Found new checkpoint at models/my_faster_rcnn_resnet101_1024_v9_test2/ckpt-11
INFO:tensorflow:Finished eval step 0
I0429 08:47:23.586150 139898413045568 model_lib_v2.py:939] Finished eval step 0
I0429 08:48:00.624495 139898413045568 object_detection_evaluation.py:1335] average_precision: 0.003302
I0429 08:48:00.629477 139898413045568 object_detection_evaluation.py:1335] average_precision: 0.915684
I0429 08:48:00.644056 139898413045568 object_detection_evaluation.py:1335] average_precision: 1.000000
INFO:tensorflow:Eval metrics at step 1000
I0429 08:48:00.645452 139898413045568 model_lib_v2.py:988] Eval metrics at step 1000
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.639662
I0429 08:48:00.686012 139898413045568 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.639662
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
I0429 08:48:00.686780 139898413045568 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
I0429 08:48:00.687195 139898413045568 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
I0429 08:48:00.687579 139898413045568 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
INFO:tensorflow:        + Loss/RPNLoss/localization_loss: 0.022732
I0429 08:48:00.687929 139898413045568 model_lib_v2.py:991]      + Loss/RPNLoss/localization_loss: 0.022732
INFO:tensorflow:        + Loss/RPNLoss/objectness_loss: 0.009871
I0429 08:48:00.688288 139898413045568 model_lib_v2.py:991]      + Loss/RPNLoss/objectness_loss: 0.009871
INFO:tensorflow:        + Loss/BoxClassifierLoss/localization_loss: 0.028553
I0429 08:48:00.688646 139898413045568 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/localization_loss: 0.028553
INFO:tensorflow:        + Loss/BoxClassifierLoss/classification_loss: 0.043199
I0429 08:48:00.689037 139898413045568 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/classification_loss: 0.043199
INFO:tensorflow:        + Loss/regularization_loss: 0.000000
I0429 08:48:00.689360 139898413045568 model_lib_v2.py:991]      + Loss/regularization_loss: 0.000000
INFO:tensorflow:        + Loss/total_loss: 0.104354
I0429 08:48:00.689714 139898413045568 model_lib_v2.py:991]      + Loss/total_loss: 0.104354
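
A sanity check on the numbers above: the reported PascalBoxes mAP is simply the unweighted mean of the three per-class APs, so the Mast value alone costs roughly a third of the total. A quick check with the values copied from the two logs (the dictionary keys are just my labels for the two runs):

```python
# Per-class AP values copied from the two evaluation logs above.
runs = {
    "run 1 (original file)": {"Mast": 0.003360, "Signal": 0.983333, "Schild": 1.000000},
    "run 2 (edited file)": {"Mast": 0.003302, "Signal": 0.915684, "Schild": 1.000000},
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision."""
    return sum(per_class_ap.values()) / len(per_class_ap)


for name, aps in runs.items():
    print(f"{name}: mAP = {mean_ap(aps):.6f}")
```

Both results agree with the logged PascalBoxes_Precision/mAP@0.5IOU values (0.662231 and 0.639662), so the aggregation itself looks consistent; the suspicious part is the per-class AP of the first category.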

e10101 commented 3 years ago

Hi @MariaKalt , can you please try another metrics_set: weighted_pascal_voc_detection_metrics? Let's see what you will get.

MariaKalt commented 3 years ago

Hi! I tried weighted_pascal_voc_detection_metrics when I originally opened the issue; it has the same problem as pascal_voc_detection_metrics. Would you like me to run it again to send you some details?

Edit: Why ask stupid questions when I can just quickly run the evaluation 😄

These are the results using weighted_pascal_voc_detection_metrics:

I0430 07:12:34.152458 140674817054528 object_detection_evaluation.py:1335] average_precision: 0.003302
I0430 07:12:34.160104 140674817054528 object_detection_evaluation.py:1335] average_precision: 0.915684
I0430 07:12:34.164542 140674817054528 object_detection_evaluation.py:1335] average_precision: 1.000000
INFO:tensorflow:Eval metrics at step 1000
I0430 07:12:34.179271 140674817054528 model_lib_v2.py:988] Eval metrics at step 1000
INFO:tensorflow:    + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.014986
I0430 07:12:34.216057 140674817054528 model_lib_v2.py:991]  + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.014986
INFO:tensorflow:    + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
I0430 07:12:34.216875 140674817054528 model_lib_v2.py:991]  + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
INFO:tensorflow:    + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
I0430 07:12:34.217378 140674817054528 model_lib_v2.py:991]  + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
INFO:tensorflow:    + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
I0430 07:12:34.217864 140674817054528 model_lib_v2.py:991]  + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 1.000000
INFO:tensorflow:    + Loss/RPNLoss/localization_loss: 0.022732
I0430 07:12:34.218256 140674817054528 model_lib_v2.py:991]  + Loss/RPNLoss/localization_loss: 0.022732
INFO:tensorflow:    + Loss/RPNLoss/objectness_loss: 0.009854
I0430 07:12:34.218662 140674817054528 model_lib_v2.py:991]  + Loss/RPNLoss/objectness_loss: 0.009854
INFO:tensorflow:    + Loss/BoxClassifierLoss/localization_loss: 0.028553
I0430 07:12:34.219060 140674817054528 model_lib_v2.py:991]  + Loss/BoxClassifierLoss/localization_loss: 0.028553
INFO:tensorflow:    + Loss/BoxClassifierLoss/classification_loss: 0.043199
I0430 07:12:34.219455 140674817054528 model_lib_v2.py:991]  + Loss/BoxClassifierLoss/classification_loss: 0.043199
INFO:tensorflow:    + Loss/regularization_loss: 0.000000
I0430 07:12:34.219851 140674817054528 model_lib_v2.py:991]  + Loss/regularization_loss: 0.000000
INFO:tensorflow:    + Loss/total_loss: 0.104337
I0430 07:12:34.220247 140674817054528 model_lib_v2.py:991]  + Loss/total_loss: 0.104337
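
The weighted mAP of 0.015 next to per-class APs of 0.92 and 1.00 looks odd at first. My understanding, which is an assumption about the evaluator and not something verified in this thread, is that the weighted variant pools the detections of all classes and computes one AP over the pooled list. Under that assumption, a single class with many high-scoring false positives can dominate the result. A toy sketch with made-up scores, using a common VOC-style AP (area under the monotone precision envelope):

```python
def average_precision(scored_labels, num_gt):
    """VOC-style AP: area under the non-increasing precision envelope.

    scored_labels: iterable of (score, is_true_positive) pairs.
    num_gt: number of ground-truth boxes the detections are matched against.
    """
    ranked = sorted(scored_labels, key=lambda pair: pair[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Made-up data: "bad" has 200 confident false positives ahead of its 5 true positives.
bad = [(0.99, False)] * 200 + [(0.50, True)] * 5
good = [(0.80, True)] * 5

ap_bad = average_precision(bad, num_gt=5)
ap_good = average_precision(good, num_gt=5)
unweighted = (ap_bad + ap_good) / 2
pooled = average_precision(bad + good, num_gt=10)

print(f"AP bad={ap_bad:.4f} good={ap_good:.4f}")
print(f"unweighted mean={unweighted:.4f} pooled={pooled:.4f}")
```

In this toy case the pooled value ends up close to the bad class's AP even though the other class is perfect, which has the same shape as the numbers in the log. It does not explain where such false positives for the first category would come from.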

e10101 commented 3 years ago

Hi @MariaKalt, thanks for your reply. It seems you were using Faster R-CNN, which I haven't used before, so I will try to test my dataset and metrics with an R-CNN network.

By the way, I have a few more questions about your setup: 1) How many images are there for each category (is your data imbalanced)? 2) Can you please regenerate your TFRecord dataset with Signal labeled as 1 and Mast labeled as 2? We want to see whether the order matters. 3) Can you please try training with an SSD?

When you evaluate your checkpoints, can you please evaluate multiple metrics at the same time, so that we can compare them?

eval_config {
  metrics_set: "coco_detection_metrics"
  metrics_set: "weighted_pascal_voc_detection_metrics"
  metrics_set: "pascal_voc_detection_metrics"
  ...
}
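
Once all metric sets land in one log, a small parser makes the side-by-side comparison easier. This is only a sketch: it assumes the INFO:tensorflow: + name: value line layout seen in the logs in this thread, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches e.g. "INFO:tensorflow:   + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302"
METRIC_RE = re.compile(
    r"^INFO:tensorflow:\s*\+\s*(?P<name>.+?):\s*(?P<value>-?\d+\.\d+)\s*$"
)


def per_category_ap(log_text):
    """Return {category: {metric_family: AP}} from an evaluation log."""
    table = defaultdict(dict)
    for line in log_text.splitlines():
        match = METRIC_RE.match(line.strip())
        if not match:
            continue  # glog-style "I0430 ..." duplicates and other lines are skipped
        name, value = match.group("name"), float(match.group("value"))
        if "_PerformanceByCategory/" in name:
            family, _, rest = name.partition("_PerformanceByCategory/")
            category = rest.rsplit("/", 1)[-1]
            table[category][family] = value
    return dict(table)


# Sample lines in the format posted earlier in this thread.
SAMPLE = """\
INFO:tensorflow:    + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
INFO:tensorflow:    + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.003302
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.915684
INFO:tensorflow:        + Loss/total_loss: 0.104354
"""

result = per_category_ap(SAMPLE)
for category, families in result.items():
    print(category, families)
```

Only the per-category lines are collected; aggregate metrics and losses are ignored on purpose so the table stays one row per category.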

MariaKalt commented 3 years ago

Hi @e10101!

Data is approximately balanced with 91 Mast - 116 Signal - 116 Schild

MariaKalt commented 3 years ago

I have changed the label map as you requested and created new train and test records using this script. The evaluation results are:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.328
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.656
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.313
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.207
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.377
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.300
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.301
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.438
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.438
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.280
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.444
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.470
I0504 12:21:02.381466 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.001401
I0504 12:21:02.389808 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.789894
I0504 12:21:02.397349 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.930618
I0504 12:21:02.403618 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.001401
I0504 12:21:02.407767 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.789894
I0504 12:21:02.417206 140284735031104 object_detection_evaluation.py:1335] average_precision: 0.930618
INFO:tensorflow:Eval metrics at step 1000
I0504 12:21:02.420348 140284735031104 model_lib_v2.py:988] Eval metrics at step 1000
INFO:tensorflow:        + DetectionBoxes_Precision/mAP: 0.327969
I0504 12:21:02.445343 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP: 0.327969
INFO:tensorflow:        + DetectionBoxes_Precision/mAP@.50IOU: 0.655533
I0504 12:21:02.446773 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP@.50IOU: 0.655533
INFO:tensorflow:        + DetectionBoxes_Precision/mAP@.75IOU: 0.312963
I0504 12:21:02.447920 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP@.75IOU: 0.312963
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (small): 0.206514
I0504 12:21:02.449092 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (small): 0.206514
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (medium): 0.377365
I0504 12:21:02.450265 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (medium): 0.377365
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (large): 0.300369
I0504 12:21:02.451397 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (large): 0.300369
INFO:tensorflow:        + DetectionBoxes_Recall/AR@1: 0.300513
I0504 12:21:02.452573 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@1: 0.300513
INFO:tensorflow:        + DetectionBoxes_Recall/AR@10: 0.437607
I0504 12:21:02.453720 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@10: 0.437607
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100: 0.437607
I0504 12:21:02.454845 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100: 0.437607
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (small): 0.280000
I0504 12:21:02.455842 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (small): 0.280000
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (medium): 0.443981
I0504 12:21:02.456828 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (medium): 0.443981
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (large): 0.470455
I0504 12:21:02.457929 140284735031104 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (large): 0.470455
INFO:tensorflow:        + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.011257
I0504 12:21:02.458916 140284735031104 model_lib_v2.py:991]      + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.011257
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.001401
I0504 12:21:02.459919 140284735031104 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.001401
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.789894
I0504 12:21:02.460912 140284735031104 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.789894
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.930618
I0504 12:21:02.461950 140284735031104 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.930618
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.573971
I0504 12:21:02.462922 140284735031104 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.573971
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.001401
I0504 12:21:02.463908 140284735031104 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.001401
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.789894
I0504 12:21:02.464867 140284735031104 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.789894
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.930618
I0504 12:21:02.465878 140284735031104 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.930618
INFO:tensorflow:        + Loss/localization_loss: 0.392880
I0504 12:21:02.466730 140284735031104 model_lib_v2.py:991]      + Loss/localization_loss: 0.392880
INFO:tensorflow:        + Loss/classification_loss: 0.811016
I0504 12:21:02.467600 140284735031104 model_lib_v2.py:991]      + Loss/classification_loss: 0.811016
INFO:tensorflow:        + Loss/regularization_loss: 0.250991
I0504 12:21:02.468541 140284735031104 model_lib_v2.py:991]      + Loss/regularization_loss: 0.250991
INFO:tensorflow:        + Loss/total_loss: 1.454887
I0504 12:21:02.469535 140284735031104 model_lib_v2.py:991]      + Loss/total_loss: 1.454887

Edit: I forgot to evaluate using multiple metrics; this is now remedied.

MariaKalt commented 3 years ago

I trained using the original label map and records using the pre-trained model SSD ResNet50 640x640. The evaluation results are:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.508
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.826
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.587
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.428
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.512
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.554
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.412
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.628
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.643
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.597
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.658
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.657
I0504 12:24:42.908806 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.005840
I0504 12:24:42.913674 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.601678
I0504 12:24:42.918410 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.994505
I0504 12:24:43.024325 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.005840
I0504 12:24:43.055220 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.601678
I0504 12:24:43.085309 140261661411136 object_detection_evaluation.py:1335] average_precision: 0.994505
INFO:tensorflow:Eval metrics at step 1000
I0504 12:24:43.090143 140261661411136 model_lib_v2.py:988] Eval metrics at step 1000
INFO:tensorflow:        + DetectionBoxes_Precision/mAP: 0.508041
I0504 12:24:43.139799 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP: 0.508041
INFO:tensorflow:        + DetectionBoxes_Precision/mAP@.50IOU: 0.825912
I0504 12:24:43.141306 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP@.50IOU: 0.825912
INFO:tensorflow:        + DetectionBoxes_Precision/mAP@.75IOU: 0.586969
I0504 12:24:43.142377 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP@.75IOU: 0.586969
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (small): 0.428177
I0504 12:24:43.143496 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (small): 0.428177
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (medium): 0.512287
I0504 12:24:43.144407 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (medium): 0.512287
INFO:tensorflow:        + DetectionBoxes_Precision/mAP (large): 0.553873
I0504 12:24:43.145426 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Precision/mAP (large): 0.553873
INFO:tensorflow:        + DetectionBoxes_Recall/AR@1: 0.411624
I0504 12:24:43.146368 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@1: 0.411624
INFO:tensorflow:        + DetectionBoxes_Recall/AR@10: 0.628205
I0504 12:24:43.147215 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@10: 0.628205
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100: 0.643248
I0504 12:24:43.148077 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100: 0.643248
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (small): 0.597500
I0504 12:24:43.148993 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (small): 0.597500
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (medium): 0.657870
I0504 12:24:43.149947 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (medium): 0.657870
INFO:tensorflow:        + DetectionBoxes_Recall/AR@100 (large): 0.656818
I0504 12:24:43.150854 140261661411136 model_lib_v2.py:991]      + DetectionBoxes_Recall/AR@100 (large): 0.656818
INFO:tensorflow:        + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.015226
I0504 12:24:43.151702 140261661411136 model_lib_v2.py:991]      + WeightedPascalBoxes_Precision/mAP@0.5IOU: 0.015226
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.005840
I0504 12:24:43.152562 140261661411136 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.005840
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.601678
I0504 12:24:43.153524 140261661411136 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.601678
INFO:tensorflow:        + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.994505
I0504 12:24:43.154500 140261661411136 model_lib_v2.py:991]      + WeightedPascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.994505
INFO:tensorflow:        + PascalBoxes_Precision/mAP@0.5IOU: 0.534008
I0504 12:24:43.155525 140261661411136 model_lib_v2.py:991]      + PascalBoxes_Precision/mAP@0.5IOU: 0.534008
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.005840
I0504 12:24:43.156512 140261661411136 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Signal: 0.005840
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.601678
I0504 12:24:43.157631 140261661411136 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mast: 0.601678
INFO:tensorflow:        + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.994505
I0504 12:24:43.158624 140261661411136 model_lib_v2.py:991]      + PascalBoxes_PerformanceByCategory/AP@0.5IOU/Schild: 0.994505
INFO:tensorflow:        + Loss/RPNLoss/localization_loss: 0.019921
I0504 12:24:43.159511 140261661411136 model_lib_v2.py:991]      + Loss/RPNLoss/localization_loss: 0.019921
INFO:tensorflow:        + Loss/RPNLoss/objectness_loss: 0.008276
I0504 12:24:43.160426 140261661411136 model_lib_v2.py:991]      + Loss/RPNLoss/objectness_loss: 0.008276
INFO:tensorflow:        + Loss/BoxClassifierLoss/localization_loss: 0.033792
I0504 12:24:43.161395 140261661411136 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/localization_loss: 0.033792
INFO:tensorflow:        + Loss/BoxClassifierLoss/classification_loss: 0.046665
I0504 12:24:43.162309 140261661411136 model_lib_v2.py:991]      + Loss/BoxClassifierLoss/classification_loss: 0.046665
INFO:tensorflow:        + Loss/regularization_loss: 0.000000
I0504 12:24:43.163203 140261661411136 model_lib_v2.py:991]      + Loss/regularization_loss: 0.000000
INFO:tensorflow:        + Loss/total_loss: 0.108655
I0504 12:24:43.164107 140261661411136 model_lib_v2.py:991]      + Loss/total_loss: 0.108655

Edit: I forgot to evaluate using multiple metrics; this is now remedied.

e10101 commented 3 years ago

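Hi @MariaKalt, one more question: when you installed object_detection, did you use pip install . or pip install -e .?

If you used the first one, the updates might not apply correctly, because a regular install copies the package into site-packages and later edits to the source checkout are not picked up.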

MariaKalt commented 3 years ago

Hi @e10101 :) I followed the tutorial, this is how I installed it:

# From within TensorFlow/models/research/
protoc object_detection/protos/*.proto --python_out=.

# Install COCO API
pip install cython
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

cp object_detection/packages/tf2/setup.py .
python -m pip install .

Would it help if I posted the results of the model builder test (python object_detection/builders/model_builder_tf2_test.py)?

e10101 commented 3 years ago

Please uninstall object_detection, then install it with the --editable flag, then apply my changes.

python -m pip install -e .

Otherwise, what I put in https://github.com/tensorflow/models/pull/9956 will not work. :)
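
To see which copy of object_detection Python actually imports, a small check like the one below can help. It is only a heuristic sketch: the helper name `looks_like_regular_install` is made up for illustration, and it assumes that a regular install ends up under a `site-packages` or `dist-packages` directory while an editable install keeps importing from the source checkout. Running it from inside models/research may also report the checkout simply because the current directory is on the import path.

```python
import importlib.util
from pathlib import Path


def looks_like_regular_install(module_file):
    """Heuristic: a non-editable install lives under site-packages or dist-packages."""
    parts = Path(module_file).parts
    return "site-packages" in parts or "dist-packages" in parts


# Locate the package without importing it.
spec = importlib.util.find_spec("object_detection")
if spec is None or spec.origin is None:
    print("object_detection is not importable from this environment")
elif looks_like_regular_install(spec.origin):
    print("regular install (copied files):", spec.origin)
else:
    print("editable install or source checkout:", spec.origin)
```

If the reported path is under site-packages, edits made in the checkout will not be used until the package is reinstalled.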

MaxPRon commented 3 years ago

@e10101 @MariaKalt Has this issue been solved by now? I saw that the PR is not yet merged.

I'm running into the exact same issue when training:

In this case car is also the most frequent class, so its AP shouldn't be nearly as low as it is here. I also ran the coco_evaluation_metrics on the same training run, and although there are no direct per-class APs, I can see the impact in the mAP value.

DEV4INO commented 2 years ago

I have the same problem and have investigated it.

In the line where the ground truth classes are set, the argmax function is used. It returns the index of the largest value, but if all values are 0 (so no ground truth class exists), it returns the index of the first element. As a result, the ground truth class becomes 1. The evaluation uses this to calculate the score per class.
It's possible to set the class to 0 afterwards, so that the evaluation is correct.

What I don't know is whether this has side effects, and it's also not very elegant. What do you think about this solution?
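
The behaviour described above can be reproduced with NumPy alone. The following is a minimal sketch, not the actual Object Detection API code: the array name `groundtruth_one_hot` and its contents are assumptions for illustration, and the +1 stands for the label offset mentioned later in this thread.

```python
import numpy as np

# Hypothetical one-hot ground truth for three boxes and two classes.
# The last row is all zeros, i.e. an entry that belongs to no class.
groundtruth_one_hot = np.array([
    [0, 1],
    [1, 0],
    [0, 0],
])

# argmax returns 0 for the all-zero row, so after the +1 label offset
# that row looks like a box of the first class.
classes = np.argmax(groundtruth_one_hot, axis=1) + 1
print(classes)  # [2 1 1]

# Workaround described above: reset rows without any class to 0 afterwards.
has_class = groundtruth_one_hot.any(axis=1)
fixed = np.where(has_class, classes, 0)
print(fixed)  # [2 1 0]
```

Whether class id 0 is then ignored by the evaluator depends on the rest of the evaluation code, which this sketch does not cover.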

FPerezHernandez92 commented 2 years ago

I have the same problem and have investigated it.

In the line where the ground truth classes are set, the argmax function is used. It returns the index of the largest value, but if all values are 0 (so no ground truth class exists), it returns the index of the first element. As a result, the ground truth class becomes 1. The evaluation uses this to calculate the score per class. It's possible to set the class to 0 afterwards, so that the evaluation is correct.

What I don't know is whether this has side effects, and it's also not very elegant. What do you think about this solution?

It works for me, thanks!

JerickoDG commented 1 year ago

Hi, @GoldschmittGabriel. I would like to check that I understood what your investigation implies. Do you suggest making the ids in label_map.pbtxt zero-indexed (e.g., 0-2 for three classes), like this?

BEFORE

item {
  name : 'dog'
  id : 1
}
item {
  name : 'cat'
  id : 2
}
item {
  name : 'bird'
  id : 3
}

AFTER

item {
  name : 'dog'
  id : 0
}
item {
  name : 'cat'
  id : 1
}
item {
  name : 'bird'
  id : 2
}

I hope for your response regarding my question. Thank you.

DEV4INO commented 1 year ago

Hey @JerickoDG, I'm not sure whether our label_map.pbtxt started with ID 0 or ID 1. I'm also not sure whether that changes anything, but start with 1 as suggested. This was more than a year ago, so I'm no longer very familiar with the code, but I'll try to explain it to you.

This is a method used in the evaluation steps. All images have already been evaluated at that point; the method only prepares the results. In some cases (I don't know exactly why, maybe because of batch_size) there are ground truth rows containing only zeros, [0, 0, 0] (in your case, "not a dog, not a cat, and also not a bird"). However, the argmax function applied to [0, 0, 0] returns 0 (see image).

To this 0 the code adds 1, so the result is 1. The meaning of the entry changes from "no class at all" to "it's a dog", which is not true. So the number of ground truth occurrences of the class dog is higher than it should be. For example, you could use recall to analyze the result: (true positive)/(true positive + false negative), where false negative is basically (ground truth - true positive). Because the ground truth count is higher than it should be, the calculations are wrong. My code tries to fix this, but I'm not sure whether this is still an issue or has already been fixed.
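
The effect of an inflated ground truth count can be illustrated with a few made-up numbers. This is only a sketch of the arithmetic: all counts below are hypothetical and are not taken from the logs in this issue.

```python
def recall(true_positives, num_groundtruth):
    """Recall = TP / (TP + FN), with FN = num_groundtruth - TP."""
    false_negatives = num_groundtruth - true_positives
    return true_positives / (true_positives + false_negatives)


real_boxes = 10      # hypothetical: actual ground truth boxes of the first class
padded_rows = 90     # hypothetical: all-zero rows miscounted as the first class
true_positives = 9   # hypothetical: correct detections of the first class

print(recall(true_positives, real_boxes))                # 0.9
print(recall(true_positives, real_boxes + padded_rows))  # 0.09
```

Since average precision is computed from the precision-recall curve, a recall that can never get close to 1 also caps the AP of the first class, while the other classes are unaffected.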

JerickoDG commented 1 year ago

Hi, @GoldschmittGabriel. I appreciate your response and I understand it clearly now. As I still seem to experience the problem, I think it is not yet fixed. I would like to try your proposed solution. Can you confirm that I should insert the code block shown in your second image after the code block shown in the first image to make it work?

DEV4INO commented 1 year ago

I created a PR; you could try it out and report whether it fixes the issue. That would be great, because I don't have enough time to test it in detail right now.

The line that I added creates an array with one entry per row: the entry is 1 if the row has at least one non-zero value, and 0 if the row is all zeros. This array is added to the argmax result instead of a constant 1, so only rows with at least one entry receive the +1 offset, which is exactly what I tried to achieve back then.

But you can also just add the few lines after the shown block; that should work as well.
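
The idea of a per-row offset can be sketched in NumPy as follows. This is an illustration under assumptions, not the code from the PR: the function name `classes_from_one_hot` is hypothetical, and the real evaluation code may operate on differently shaped or typed tensors.

```python
import numpy as np


def classes_from_one_hot(one_hot):
    """Return 1-based class ids, with 0 for rows that contain no class."""
    one_hot = np.asarray(one_hot)
    # 1 for rows with at least one non-zero entry, 0 for all-zero rows.
    offset = one_hot.any(axis=1).astype(np.int64)
    # Rows with a class get argmax + 1; all-zero rows get 0 + 0.
    return np.argmax(one_hot, axis=1) + offset


example = classes_from_one_hot([
    [1, 0, 0],   # first class  -> 1
    [0, 0, 1],   # third class  -> 3
    [0, 0, 0],   # no class     -> 0
])
print(example)  # [1 3 0]
```

Compared with resetting the class afterwards, this keeps the correction in a single expression, at the cost of being slightly less obvious to read.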

tensorflow / models

pascal_voc_detection_metrics gives very low scores for first category in label_map #9927