rafaelpadilla / review_object_detection_metrics

Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding box formats as in COCO, PASCAL, Imagenet, etc.

Load images and annotations from the same folder #67

Closed f-fl0 closed 3 years ago

f-fl0 commented 3 years ago

Problem description

This PR fixes an issue that occurs when the images and annotation files are stored in the same folder. Here is an example dataset that triggers the problem: three images, the three corresponding annotation files, and the classes.txt file, all in the root folder of the dataset.

.
├── 000000.jpg
├── 000000.txt
├── 000001.jpg
├── 000001.txt
├── 000002.jpg
├── 000002.txt
├── classes.txt

With the version in the main branch, some images might be skipped because https://github.com/rafaelpadilla/review_object_detection_metrics/blob/49a614a6ca59e39f2aecd085730cc2e4d30cd886/src/utils/general_utils.py#L186 relies on os.walk, which iterates through the files of the specified folder in arbitrary order. (Depending on the Python version, os.walk uses os.listdir or os.scandir, and both return entries in arbitrary order; see the official documentation: https://docs.python.org/3/library/os.html#os.listdir, https://docs.python.org/3/library/os.html#os.scandir.) It is therefore possible that, when looking for 000001.jpg for example, 000001.txt is found first. Because find_file ignores extensions when looking for the image file corresponding to an annotation file, 000001.txt is incorrectly returned. As it is not a valid image file, properties such as the image resolution cannot be retrieved, and thus 000001 is skipped (e.g. https://github.com/rafaelpadilla/review_object_detection_metrics/blob/49a614a6ca59e39f2aecd085730cc2e4d30cd886/src/utils/converter.py#L329-L331).
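The failure mode can be reproduced in isolation with a minimal sketch. Note that `match_ignoring_extension` is a hypothetical helper written for illustration, not the project's `find_file`, and the listing order is passed in explicitly to stand in for whatever order os.walk happens to produce.

```python
import os


def match_ignoring_extension(filenames, target):
    """Return the first name whose stem equals the stem of `target`.

    Hypothetical helper: it mimics extension-agnostic matching and is
    not the code from general_utils.py.
    """
    target_stem = os.path.splitext(target)[0]
    for name in filenames:
        if os.path.splitext(name)[0] == target_stem:
            return name
    return None


# Two possible listing orders for the same folder contents.
order_a = ["000001.jpg", "000001.txt"]
order_b = ["000001.txt", "000001.jpg"]

# Looking for the image that belongs to the annotation 000001.txt.
result_a = match_ignoring_extension(order_a, "000001.txt")  # the image
result_b = match_ignoring_extension(order_b, "000001.txt")  # the annotation itself
```

With the first ordering the image is returned; with the second, the annotation file matches itself, which is the case that leads to the image being skipped.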

Proposed solution

This PR introduces a find_image_file function that can be used to find the corresponding image file given an annotation filename. It relies on a modified version of the find_file function, which can now be called with a list of allowed extensions when trying to match filenames. In find_image_file, find_file is called with allowed_extensions set to a list of the most common image file extensions: .bmp, .jpg, .jpeg, .png.
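A rough sketch of the idea follows. The names find_file, find_image_file, and allowed_extensions come from the description above, but the signatures, the `IMAGE_EXTENSIONS` constant, and the function bodies are assumptions made for illustration; the actual code in the PR may differ.

```python
import os
import tempfile

# Assumed constant holding the extensions listed in the PR description.
IMAGE_EXTENSIONS = (".bmp", ".jpg", ".jpeg", ".png")


def find_file(directory, file_name, allowed_extensions=None):
    """Find a file under `directory` whose stem matches `file_name`.

    If `allowed_extensions` is given, only files with one of those
    extensions are accepted. Signature is an assumption, not the PR's.
    """
    target_stem = os.path.splitext(os.path.basename(file_name))[0]
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if stem != target_stem:
                continue
            if allowed_extensions is not None and ext not in allowed_extensions:
                continue  # e.g. skip 000001.txt when looking for an image
            return os.path.join(dirpath, name)
    return None


def find_image_file(directory, annotation_file_name):
    """Find the image that corresponds to an annotation file name."""
    return find_file(directory, annotation_file_name,
                     allowed_extensions=IMAGE_EXTENSIONS)


# Usage on a throwaway folder that mixes images and annotations.
with tempfile.TemporaryDirectory() as root:
    for name in ("000001.txt", "000001.jpg", "classes.txt"):
        open(os.path.join(root, name), "w").close()
    found_name = os.path.basename(find_image_file(root, "000001.txt"))
    missing = find_image_file(root, "classes.txt")
```

Because the annotation file is filtered out by its extension, the result for 000001.txt no longer depends on the order in which os.walk lists the folder.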

rafaelpadilla commented 3 years ago

Hi @f-fl0

Thank you for your PR. This PR indeed allows the user to have images and ground-truths or images and detections in the same folder.

Nevertheless, issue #68 flagged another problem: upper-case extensions such as "PNG" instead of "png".
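One common way to handle this is to lower-case the extension before comparing it. The sketch below is only an illustration of that approach; `has_allowed_extension` is a hypothetical helper and not necessarily how the fix in master is implemented.

```python
import os


def has_allowed_extension(file_name,
                          allowed_extensions=(".bmp", ".jpg", ".jpeg", ".png")):
    """Check the extension case-insensitively (hypothetical helper)."""
    ext = os.path.splitext(file_name)[1]
    return ext.lower() in allowed_extensions


upper_ok = has_allowed_extension("000001.PNG")
annotation_ok = has_allowed_extension("000001.txt")
```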

I have just fixed this problem and pushed the fix to master in this commit.

Could you please check if your PR works as expected with this new commit?

Best regards!

f-fl0 commented 3 years ago

I confirmed that my PR still works as expected after getting the latest changes.