rosswin closed this issue 2 years ago
@rosswin on a different note, I see many mAP values recorded by the TFOD API. I am particularly interested to know what mAP@0.5IOU stands for, since the API reports a separate mAP value as well.
Thanks for posting here, @sayakpaul. I kept working on this issue and I think I figured it out. I forgot to come back and update, so here I am!
I am still a student on these challenging topics, so hopefully @svpino or another guru can fact check me below...
TFODAPI's mAP - I believe this is using the Microsoft COCO method, which is a 101-point interpolated AP (the precision is sampled at 101 recall levels). In short, it calculates this Average Precision (AP) at 10 different IOU thresholds (0.50 to 0.95 in steps of 0.05) and averages all of those APs into a single mAP.
TFODAPI's mAP@0.5IOU - I believe this is the mAP at just the 50% IOU level - the same as above, just not averaged across the 10 different IOU thresholds.
This repository's mAP@0.5IOU - This repository allows the user to manually select the IOU and CONFIDENCE_SCORE values required to register a detection (a true positive). The script then reports both the AP and Average Recall (AR) at the given IOU and CONFIDENCE_SCORE thresholds. There is no averaging across multiple IOU thresholds, and that is why the numbers don't match.
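For reference, the IOU check that decides whether a detection registers as a true positive can be sketched as below. This is a minimal illustration, not the repository's actual code; the threshold values and the `is_true_positive` helper are hypothetical placeholders standing in for the script's IOU and CONFIDENCE_SCORE settings:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

IOU_THRESHOLD = 0.5         # hypothetical stand-ins for the script's
CONFIDENCE_THRESHOLD = 0.5  # IOU / CONFIDENCE_SCORE settings

def is_true_positive(pred_box, score, gt_box):
    # A detection counts only if it is confident enough AND overlaps
    # a ground-truth box enough.
    return score >= CONFIDENCE_THRESHOLD and iou(pred_box, gt_box) >= IOU_THRESHOLD
```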
TLDR; MS COCO's mAP calculation averages the AP across 10 IOU thresholds (0.50-0.95). The other metrics hold the IOU fixed at 0.5.
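To make the distinction concrete, here is a minimal sketch (not the actual COCO or TFODAPI code) of 101-point interpolated AP for one class's precision/recall curve at a single IOU threshold; the recall/precision values below are a made-up toy example:

```python
def interpolated_ap(recalls, precisions):
    """101-point interpolated AP: sample precision at recall levels
    0.00, 0.01, ..., 1.00, taking at each level the maximum precision
    achieved at any recall >= that level."""
    total = 0.0
    for i in range(101):
        r = i / 100.0
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / 101

# Toy precision/recall curve for one class at a single IOU threshold
# (e.g. IOU = 0.5, which on its own corresponds to AP@0.5IOU):
ap_at_50 = interpolated_ap([0.25, 0.5, 0.75, 1.0], [1.0, 0.8, 0.6, 0.4])

# COCO's headline mAP would additionally average APs computed this way
# at each of the 10 IOU thresholds 0.50, 0.55, ..., 0.95, and over all
# classes; mAP@0.5IOU skips that averaging and uses IOU = 0.5 only.
```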
I hope this helps!
Here is the webpage that I seem to come back to most while studying this
Thank you so much @rosswin!
Hello,
Thank you very much for sharing your code!
I am finding that my evaluation with the Tensorflow Object Detection API (using model_main.py in eval mode) reports an mAP@0.5IOU of 0.361.
I then use TFODAPI's infer_detections.py to export a tfrecord file that contains the predictions and ground truth boxes. This tfrecord is then fed into this script to generate a confusion matrix, which seems to work perfectly! However, when I take the average of each class's AP@0.5IOU I get a different result: 0.639 mAP@0.5IOU.
I have tried varying the CONFIDENCE_THRESHOLD, but I never get an mAP@0.5IOU that matches TFODAPI's.
Do you have any idea why these values might be different? I am sure there is a straightforward answer, but I have been reviewing the code for TFODAPI's COCO evaluation metrics and for this repository, and I cannot figure it out!
Thank you!
Can you please help me with how to generate the --detections_record=testing_detections.record file needed to build the confusion matrix?
Hi @Annieliaquat
I don't have enough details from you to make a full diagnosis, but I believe you may be using an older workflow from the TF Object Detection API v1. The latest version of the TF Object Detection API is version 2.
In the old API workflow, detection/evaluation with a trained model was done using the infer_detections.py script included in the API. That script used a frozen_inference_graph to produce a detections.record containing the model's detections, and the detections.record file was then given to confusion_matrix.py to generate the confusion matrix.
In the new workflow (TF v2) there is no infer_detections and the frozen_inference_graph is no longer used. You now need to export your trained model to a saved_model (exporter_main_v2.py), and then provide that saved_model directory and your test data set (as a tfrecord) to the newest version of the confusion matrix script: confusion_matrix_tf2.py.
Also, I believe this comment may be off topic for the original issue, and since the issue hasn't had traffic in a while I am closing it. If you need additional help, perhaps open a new ticket to get better visibility? I hope this all helps!
I am using Tensorflow v2. I have provided the saved_model and test.tfRecord to generate the confusion matrix, but it is not being generated.