WongKinYiu / YOLO

An MIT rewrite of YOLOv9
MIT License

AP can be larger than 1 #38

Open cewang94 opened 1 month ago

cewang94 commented 1 month ago

Describe the bug

I was trying to verify the validation accuracy of the model and got an abnormally high result. Digging deeper, I found that the AP produced in calculate_map() in bounding_box_utils can sometimes be larger than 1, which artificially inflates the model's mAP. Specifically, there is no check ensuring that each prediction is matched to only one ground truth label. This allows more true positives than ground truths, so the recall can exceed 1, which in turn makes the AP exceed 1 in some cases.
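For illustration, here is a minimal sketch (not the repo's actual code) of greedy one-to-one matching that caps true positives at the number of ground truths; the IoU matrix and the confidence-sorted prediction order are assumptions:

import numpy as np

def count_true_positives(iou_matrix, iou_threshold=0.5):
    # iou_matrix[p, g] holds the IoU between prediction p and ground truth g,
    # with predictions already sorted by descending confidence.
    num_preds, num_gts = iou_matrix.shape
    gt_matched = np.zeros(num_gts, dtype=bool)
    tp = np.zeros(num_preds, dtype=bool)
    for p in range(num_preds):
        # Find the best still-unmatched ground truth above the threshold.
        best_iou, best_g = iou_threshold, -1
        for g in range(num_gts):
            if not gt_matched[g] and iou_matrix[p, g] >= best_iou:
                best_iou, best_g = iou_matrix[p, g], g
        if best_g >= 0:
            gt_matched[best_g] = True  # each ground truth matches at most once
            tp[p] = True
    return tp  # sum(tp) can never exceed num_gts, so recall stays <= 1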

To Reproduce

Steps to reproduce the behavior:

  1. Add a check, e.g. if ap > 1: print(ap), inside the calculate_map() function, and observe that it triggers during validation.

Expected behavior

Each ground truth label should only have one corresponding true positive prediction.

Screenshots

[screenshot]

System Info (please complete the following information):

henrytsui000 commented 1 month ago

Hi,

Thank you for your report! I will work on addressing this issue in the coming days. In the meantime, I recommend using the JSON dataset as the source and pycocotools to calculate the mAP.
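For reference, a minimal pycocotools sketch (file paths are illustrative; predictions must be exported in the COCO result format):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Illustrative paths: COCO-format ground truth and model predictions.
coco_gt = COCO("annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("predictions.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP@[.5:.95], AP@.5, AP@.75, and AR metrics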

Hao-Tang Tsui

cewang94 commented 1 month ago

Hi!

Thanks for your reply! I used pycocotools to calculate the mAP and found that the results are quite a bit lower than in the paper:

[screenshot: results using the conf and min_iou from the official YOLOv9 GitHub]

[screenshot: results using the default conf and min_iou from this repo]

This is using yolov9-c.

Is there a setting that will help me reproduce the results from the YOLOv9 paper? (0.53 mAP)

henrytsui000 commented 1 month ago

Hi,

The default IoU and confidence values are tuned for visualization rather than for the best mAP. The discrepancy with the paper's results comes from the inference strategy used there: images are sorted by their width-to-height ratio, and an anchor grid is generated for each ratio. I haven't implemented that strategy in this project, since it does not affect inference time and mainly serves to achieve a higher mAP. If you want the same accuracy as in the paper, you can refer to the repository you mentioned. The strategy has been added to our to-do list~
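As a general convention (an assumption about typical practice, not this repo's documented defaults), mAP is computed with a near-zero confidence threshold so the full precision-recall curve is sampled, while visualization uses a much higher threshold:

# Illustrative values only; the actual config keys in this repo may differ.
MAP_EVAL = {"min_confidence": 0.001, "nms_iou": 0.65}  # keep nearly all boxes
VISUALIZE = {"min_confidence": 0.25, "nms_iou": 0.45}  # show only confident boxes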

HenryTsui

cewang94 commented 1 month ago

Hi Henry,

Thanks for your reply! Could you point me to a more specific place in the official github where I can find their implementation of the aforementioned strategy? Thank you.

henrytsui000 commented 1 month ago

Hi,

You can find the operation at this link.

# utils/dataloaders.py Line 544
if self.rect:
    # Sort by aspect ratio
    s = self.shapes  # wh
    ar = s[:, 1] / s[:, 0]  # aspect ratio
    irect = ar.argsort()
    self.im_files = [self.im_files[i] for i in irect]
    self.label_files = [self.label_files[i] for i in irect]
    self.labels = [self.labels[i] for i in irect]
    self.segments = [self.segments[i] for i in irect]
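In the same file, this sort is followed by a per-batch shape computation, so that each batch is padded to a rectangle fitting its images' aspect ratios. A simplified sketch of that step (variable names are illustrative, not the file's exact code):

import numpy as np

def rect_batch_shapes(wh, batch_size, img_size=640, stride=32, pad=0.5):
    # wh: (N, 2) array of image widths and heights, one row per image.
    ar = np.sort(wh[:, 1] / wh[:, 0])  # h/w aspect ratios, ascending
    num_batches = int(np.ceil(len(wh) / batch_size))
    shapes = [[1.0, 1.0]] * num_batches
    for i in range(num_batches):
        ari = ar[i * batch_size:(i + 1) * batch_size]
        mini, maxi = ari.min(), ari.max()
        if maxi < 1:  # every image in the batch is wider than tall
            shapes[i] = [maxi, 1.0]
        elif mini > 1:  # every image in the batch is taller than wide
            shapes[i] = [1.0, 1.0 / mini]
    # Round each batch shape up to a multiple of the model stride.
    return np.ceil(np.array(shapes) * img_size / stride + pad).astype(int) * stride

Grouping similar aspect ratios into the same batch minimizes padding, which is why the sort above matters.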

Best regards,
Henry Tsui

cewang94 commented 1 month ago

Hi Henry,

Thanks for your help! If I'm not wrong, this helps during training, correct? Were the weights provided in this repo trained from scratch, hence the discrepancy with the official YOLOv9 version? If I want to match the official YOLOv9 results, will I have to implement this rectangular training strategy and retrain the model from scratch?