lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0

Obtain labels in JSON format #262

Open erminkev1 opened 5 months ago

erminkev1 commented 5 months ago

Hello!

Thank you for this super awesome project.

I was wondering whether it is possible to obtain COCO-style labels (predictions in JSON) from the evaluation of the models. I've done it manually, and my setup ensures that a) image resizing follows the specification in the dataloader, and b) the image size, i.e. (1280x768), matches the eval_spatial_size used at training time.

Code example:

get_labels.py

import json
import os

import torch
from PIL import Image
from torchvision.transforms import ToTensor

# Model, IMAGE_PATH, SAVEPATH and mapping (file name -> image_id) are defined elsewhere.
model = Model()
DICT_LIST = []

# Scale differently depending on the config and the image size,
# i.e. original image (1280x720)
SCALEx = 720 / 768
SCALEy = 1280 / 1280

for img in os.listdir(IMAGE_PATH):
    path_to_image = os.path.join(IMAGE_PATH, img)
    mapped_id = mapping[img]
    im = Image.open(path_to_image).convert('RGB')
    im = im.resize((768, 1280))  # PIL takes (width, height)
    image_tensor = ToTensor()(im)[None]  # add a batch dimension
    size = torch.tensor([[768, 1280]])

    labels, boxes, scores = model(image_tensor.float(), size)
    thresh = 0
    scr = scores[0]
    filter_score = scr > thresh
    label = labels[0][filter_score]
    box = boxes[0][filter_score]
    scr = scr[filter_score]

    for num, box_xyxy in enumerate(box):
        x1, y1, x2, y2 = box_xyxy.tolist()
        # Rescale from the resized image back to the original image
        x1, x2 = x1 * SCALEx, x2 * SCALEx
        y1, y2 = y1 * SCALEy, y2 * SCALEy
        # xyxy -> COCO xywh, truncated to integer pixels
        box_xywh = list(map(int, [x1, y1, x2 - x1, y2 - y1]))
        score = scr[num].item()
        TEMP_DICT = {"image_id": mapped_id, "category_id": 0, "bbox": box_xywh, "score": score}
        DICT_LIST.append(TEMP_DICT)

with open(os.path.join(SAVEPATH, 'predictions-detr.json'), 'w') as foo:
    json.dump(DICT_LIST, foo)
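For reference, the rescale-and-convert step of the script can be pulled out into a small pure function. This is only a sketch: `to_coco_xywh` and its `ndigits` parameter are hypothetical names, not part of RT-DETR. Unlike the script, it keeps float coordinates (rounded) instead of truncating with int(); whether that accounts for any of the gap is untested.

```python
def to_coco_xywh(box_xyxy, scale_x=1.0, scale_y=1.0, ndigits=2):
    """Rescale an [x1, y1, x2, y2] box and convert it to COCO [x, y, w, h].

    Hypothetical helper, not part of RT-DETR. Coordinates stay floats
    (rounded to `ndigits` decimals) rather than being truncated with int().
    """
    x1, y1, x2, y2 = box_xyxy
    # Map from the resized image back to the original image.
    x1, x2 = x1 * scale_x, x2 * scale_x
    y1, y2 = y1 * scale_y, y2 * scale_y
    # Corner format -> top-left corner plus width and height.
    return [round(v, ndigits) for v in (x1, y1, x2 - x1, y2 - y1)]


# Same scale factors as in the script: 768 -> 720 wide, height unchanged.
box = to_coco_xywh([76.8, 10.5, 153.6, 50.25], scale_x=720 / 768, scale_y=1.0)
print(box)
```

In the loop, `box_xywh = to_coco_xywh(box_xyxy.tolist(), SCALEx, SCALEy)` would replace the three conversion lines.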

This saves the labels, and you can then run the COCO evaluation protocol yourself, but the mAP values I obtain are about 1-2% lower than when I run train.py with the --test-only flag (having made sure that the validation dataloader points to the proper set) :).
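For completeness, this is roughly how the saved file can be scored with pycocotools. It is a minimal sketch, not necessarily what train.py does internally: `check_results` and `run_coco_eval` are hypothetical helper names, and the ground-truth file name in the usage comment is a placeholder. Note that each `category_id` in the predictions has to match a category id in the ground-truth annotations.

```python
import json


def check_results(results):
    """Return entries that lack a field required by the COCO bbox results format."""
    required = {"image_id", "category_id", "bbox", "score"}
    return [r for r in results if not required <= set(r) or len(r["bbox"]) != 4]


def run_coco_eval(gt_json, pred_json):
    """Run the COCO bbox evaluation and return the summary statistics."""
    # Imported lazily so check_results() is usable without pycocotools installed.
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    with open(pred_json) as f:
        bad = check_results(json.load(f))
    if bad:
        raise ValueError(f"{len(bad)} malformed prediction entries")

    coco_gt = COCO(gt_json)               # ground-truth annotations
    coco_dt = coco_gt.loadRes(pred_json)  # predictions in results format
    coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()                 # prints the AP/AR table
    return coco_eval.stats


# Usage (placeholder ground-truth path):
# stats = run_coco_eval("annotations.json", "predictions-detr.json")
```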

Is there something about how DETR scales the labels during the validation process that I am missing?