DetectionDataset.from_yolo bad conversion with autodistill_grounded_sam DetectionDataset object

Youho99 commented 7 months ago

Search before asking

[X] I have searched the Supervision issues and found no similar bug report.

Bug

The sv.DetectionDataset.from_yolo function behaves abnormally when it loads a dataset produced by autodistill_grounded_sam.

When I call sv.DetectionDataset.from_yolo on a dataset generated via base_model.label (base_model being GroundedSAM), the number of detections differs from the number in the object returned by base_model.label, even though from_yolo is only supposed to perform a conversion.

Note that I did the test with a GroundingDino base_model, and I did not encounter the problem.

The number of detections returned can be lower or higher than the number of original detections (at most 900 originals in my experience, with a confidence threshold of 0.00).

Environment

Supervision = 0.19.0

Minimal Reproducible Example

from autodistill_grounded_sam import GroundedSAM
from autodistill.detection import CaptionOntology
from pathlib import Path
import supervision as sv

base_model = GroundedSAM(
    ontology=CaptionOntology(
        {
            "screen": "a computer screen",
        }
    ),
    box_threshold = 0.00
)

# Put the cat image in your input directory
# Set this to your input directory path
input_dir = "/home/ggiret/Téléchargements/chat"
output_dir = "test/"

results = base_model.label(
        input_folder=input_dir,
        extension=".png",
        output_folder=output_dir, 
        record_confidence=True)

# Put the correct image name if the name changed
len(results.annotations['images.png'].class_id)

900

sv_dataset = sv.DetectionDataset.from_yolo(
        images_directory_path=Path(output_dir).joinpath("images"),
        annotations_directory_path=Path(output_dir).joinpath("annotations"),
        data_yaml_path=Path(output_dir).joinpath("data.yaml"))

# Put the correct image name if the name changed
len(sv_dataset.annotations['test/images/images.jpg'].class_id)

1100

Additional

This is the image I used to get the result of 1100 class_id entries.

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!

SkalskiP commented 7 months ago

Hi @Youho99 👋🏻 The reason might be the disconnected masks generated by GroundedSAM - that is, two separate masks represent one bounding box. The YOLO format does not allow for such a situation, so such objects are split and saved separately. As a result, you end up with more objects.
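The splitting described above can be illustrated with a minimal, library-free sketch. It is not Supervision's actual implementation: the `count_connected_regions` helper, the 4-connectivity rule, and the toy mask are all assumptions made for illustration. It only shows how one detection whose mask has two disconnected blobs turns into two objects once each blob is written as its own entry.

```python
from collections import deque


def count_connected_regions(mask):
    """Count 4-connected regions of truthy cells in a 2D list-of-lists mask.

    Hypothetical helper for illustration; not part of Supervision.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # New unvisited foreground cell: flood-fill its whole region.
            regions += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and mask[ny][nx]
                        and not seen[ny][nx]
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


# One detection whose mask contains two disconnected blobs (toy data).
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
detection_masks = [mask]

# If every region is saved as a separate object, the count grows.
yolo_object_count = sum(count_connected_regions(m) for m in detection_masks)
print(len(detection_masks), yolo_object_count)  # 1 2
```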

Youho99 commented 7 months ago

@SkalskiP This may indeed be a reason. But in some cases, I also end up with fewer items.

SkalskiP commented 7 months ago

@Youho99 That is because a detection is sometimes removed for being too large or too small. There might be a bug, but I need to see a concrete, reproducible example of something going wrong.
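As a rough illustration of how size-based filtering can lower the count, here is a minimal sketch. The `filter_by_area` and `polygon_area` helpers, the `max_fraction` parameter, and the reading of the threshold as a fraction of the image area are assumptions made for this example; they are not taken from Supervision's code.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula; points are (x, y) pairs."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def filter_by_area(polygons, image_width, image_height,
                   min_fraction=0.01, max_fraction=1.0):
    """Keep polygons whose area, as a fraction of the image area, is within bounds.

    Hypothetical helper; thresholds are treated as fractions (assumption).
    """
    image_area = image_width * image_height
    kept = []
    for polygon in polygons:
        fraction = polygon_area(polygon) / image_area
        if min_fraction <= fraction <= max_fraction:
            kept.append(polygon)
    return kept


polygons = [
    [(0, 0), (50, 0), (50, 50), (0, 50)],  # 2500 px^2, 25% of a 100x100 image
    [(0, 0), (5, 0), (5, 5), (0, 5)],      # 25 px^2, 0.25% of the image
]
kept = filter_by_area(polygons, 100, 100, min_fraction=0.01)
print(len(polygons), len(kept))  # 2 1
```

With a very low box threshold, many tiny detections are plausible, so a filter of this kind could remove a noticeable share of them.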

Youho99 commented 7 months ago

In fact, my goal is to directly convert the dataset into a format that CVAT supports. CVAT does not support segmentation with the YOLO format, so in any case I will not use it as such. I will convert to COCO format via Supervision, I will import the annotations to CVAT, and export them to YOLO format from CVAT. Then I will compare them with the result of the label function of autodistill_grounded_sam.

I will get back to you if that comparison already shows a difference (probably on the autodistill_grounded_sam repo, if that is where the problem lies).
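For a comparison like the one described, a small sketch along these lines could count the objects in exported YOLO label files and list the images whose counts differ. It relies only on the fact that a YOLO label file holds one object per line; the `count_yolo_objects` and `diff_counts` helpers and the sample counts are hypothetical.

```python
import tempfile
from pathlib import Path


def count_yolo_objects(annotations_dir):
    """Map each label file stem to its number of non-empty lines (one object per line)."""
    counts = {}
    for label_file in sorted(Path(annotations_dir).glob("*.txt")):
        lines = label_file.read_text().splitlines()
        counts[label_file.stem] = sum(1 for line in lines if line.strip())
    return counts


def diff_counts(expected, actual):
    """Return {name: (expected, actual)} for every name whose counts differ."""
    names = sorted(set(expected) | set(actual))
    return {
        name: (expected.get(name, 0), actual.get(name, 0))
        for name in names
        if expected.get(name, 0) != actual.get(name, 0)
    }


# Toy label file with two objects, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "images.txt").write_text(
        "0 0.5 0.5 0.2 0.2\n0 0.1 0.1 0.05 0.05\n"
    )
    exported = count_yolo_objects(tmp)

in_memory = {"images": 3}  # hypothetical count taken from the in-memory dataset
mismatches = diff_counts(in_memory, exported)
print(mismatches)  # {'images': (3, 2)}
```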

Indeed, there is certainly a bug in from_yolo for datasets generated with autodistill_grounded_sam, but I have no use case with which to verify this, since I am not going to use that function after all.

Youho99 commented 7 months ago

@SkalskiP Edit:

I face the same problem with the .as_coco() function when exporting my dataset to COCO format. Except that here, the function crashes ~~because of the difference in detection count (1100 versus 900) for the cat image given above. For other images it doesn't crash, but the problem may just be hidden.~~

I get the error with any image

My first conclusion is that we cannot export a dataset generated via Grounded SAM (autodistill_grounded_sam) at the moment.

# Reuse `results` from the code above
import os

results.as_coco(
    # images_directory_path=os.path.join("temp_result_coco", "images"),
    annotations_path=os.path.join("temp_result_coco", "labels.json"),
)

2024-03-27 10:28:06.724 Uncaught app exception
Traceback (most recent call last):
  File "/home/ggiret/miniconda3/envs/deeplabel/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
  File "/home/ggiret/Documents/deeplabel/pages/3_Labels_Generation.py", line 64, in <module>
    auto_labeling()
  File "/home/ggiret/Documents/deeplabel/backend/labels_generation.py", line 27, in auto_labeling
    st.session_state.dataset.as_coco(
  File "/home/ggiret/miniconda3/envs/deeplabel/lib/python3.10/site-packages/supervision/dataset/core.py", line 457, in as_coco
    save_coco_annotations(
  File "/home/ggiret/miniconda3/envs/deeplabel/lib/python3.10/site-packages/supervision/dataset/formats/coco.py", line 218, in save_coco_annotations
    coco_annotation, annotation_id = detections_to_coco_annotations(
  File "/home/ggiret/miniconda3/envs/deeplabel/lib/python3.10/site-packages/supervision/dataset/formats/coco.py", line 114, in detections_to_coco_annotations
    approximate_mask_with_polygons(
IndexError: list index out of range

Youho99 commented 7 months ago

Could modifying autodistill's label function theoretically fix the problem?

For example, by ensuring that the base conversion targets COCO rather than YOLO (COCO carries more information, so converting COCO to another annotation format is simpler than the reverse).

# For example, by modifying this code to call as_coco instead of as_yolo
dataset.as_yolo(
    output_folder + "/images",
    output_folder + "/annotations",
    min_image_area_percentage=0.01,
    data_yaml_path=output_folder + "/data.yaml",
)

Could this work?

Youho99 commented 7 months ago

Since I am no longer working with the from_yolo function, I am closing this issue. I will reopen it if I notice an actual bug with this function.
