cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

Using CocoEvaluator on custom datasets error #418

Closed EMCP closed 2 years ago

EMCP commented 4 years ago

I am extending the PyTorch COCO dataloader to support my custom COCO-tagged datasets, and when I go to evaluate this data in pycocotools, an assertion fails. Am I loading the data incorrectly?

https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py

def loadRes(self, resFile):
    """
    Load result file and return a result api object.
    :param   resFile (str)     : file name of result file
    :return: res (obj)         : result api object
    """
    res = COCO()
    res.dataset['images'] = [img for img in self.dataset['images']]

    # print('Loading and preparing results...')
    # tic = time.time()
    if isinstance(resFile, torch._six.string_classes):
        anns = json.load(open(resFile))
    elif type(resFile) == np.ndarray:
        anns = self.loadNumpyAnnotations(resFile)
    else:
        anns = resFile
    assert type(anns) == list, 'results in not an array of objects'
    annsImgIds = [ann['image_id'] for ann in anns]
    assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
        'Results do not correspond to current coco set'

I am getting AssertionError: Results do not correspond to current coco set

It seems there is an image ID mismatch between my results and the ground-truth data?
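
The assertion quoted above passes only when every image_id in the results is also one of the ground-truth image ids. A minimal sketch for listing the offending ids follows; the helper name `find_unmatched_image_ids` and the sample ids are hypothetical, and the ground-truth list stands in for whatever `getImgIds()` returns on the ground-truth object.

```python
def find_unmatched_image_ids(result_anns, ground_truth_image_ids):
    """Return the sorted image ids found in the results but not in the ground truth."""
    result_ids = {ann["image_id"] for ann in result_anns}
    return sorted(result_ids - set(ground_truth_image_ids))


# Hypothetical data: ground-truth ids as they might appear in the annotation file,
# and results where one entry uses a dataset index (0) instead of a real image id.
ground_truth_ids = [139, 285, 632]
results = [
    {"image_id": 0, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.9},
    {"image_id": 285, "category_id": 1, "bbox": [5.0, 5.0, 50.0, 60.0], "score": 0.8},
]

unmatched = find_unmatched_image_ids(results, ground_truth_ids)
print(unmatched)  # any id listed here would trip the assertion
```

An empty list means the ids line up. If ids that look identical are still reported, it may be worth checking their types, for example an int on one side and a string on the other.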

EMCP commented 4 years ago

For reference, here is my dataloader:

import os

import torch
from torchvision.transforms import functional as F
from torchvision.datasets import CocoDetection
# https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html

class CustomCocoDataset(CocoDetection):

    def __init__(self, data_conf, model_conf, testing_mode_on=False):

        if testing_mode_on is False:
            target_data_set = data_conf["image_data_training_id"]
            coco_filename = data_conf["coco_annotations_training_id"]
        else:
            # Testing mode: load the testing images and annotations (the split used with COCOEval)
            target_data_set = data_conf["image_data_testing_id"]
            coco_filename = data_conf["coco_annotations_testing_id"]

        self.coco_data_root = os.path.join(data_conf["image_pool_path"],
                                           target_data_set + "/" +
                                           data_conf["image_data_sub_dir"])

        self.coco_annotations_file = os.path.join(data_conf["image_pool_path"],
                                                  target_data_set,
                                                  "annotations",
                                                  coco_filename + ".json")

        super(CustomCocoDataset, self).__init__(root=self.coco_data_root,
                                                annFile=self.coco_annotations_file)

        self.categories = data_conf["classes_available"]

        print("found " + str(self.get_num_categories()) + " categories in data at: " + str(self.coco_data_root))

    def get_num_categories(self):
        return len(self.categories)

    def __getitem__(self, item):

        (pil_image, targets) = super(CustomCocoDataset, self).__getitem__(item)

        # get bounding box coordinates for each mask
        num_targets = len(targets)
        boxes = []
        for i in range(num_targets):
            box = targets[i]["bbox"]
            xmin = box[0]
            xmax = box[0] + box[2]
            ymin = box[1]
            ymax = box[1] + box[3]
            boxes.append([xmin, ymin, xmax, ymax])

        # convert everything into a torch.Tensor, unless you're performing testing (ie COCOEval)

        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        labels = torch.ones((num_targets,), dtype=torch.int64)
        image_id = torch.tensor([item])
        areas = []
        for i in range(num_targets):
            areas.append(targets[i]["area"])
        areas = torch.as_tensor(areas, dtype=torch.float32)
        iscrowd = torch.zeros((num_targets,), dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["image_id"] = image_id
        target["area"] = areas
        target["iscrowd"] = iscrowd

        image = F.to_tensor(pil_image)

        return image, target

if __name__ == "__main__":
    import json

    with open("../../dataset_template.json") as f:
        config_json = json.load(f)

    with open("../../config.json") as fp:
        model_conf = json.load(fp)

    loader = CustomCocoDataset(data_conf=config_json, model_conf=model_conf)
    print(len(loader))
    print(loader[4])

    print("Done")
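
One possible cause, not confirmed anywhere in this thread: the dataloader above sets `target["image_id"]` from `item`, which is the position in the dataset, while the evaluator compares against the ids stored in the annotation file. torchvision's `CocoDetection` keeps those ids in a sorted list, `self.ids`. The sketch below uses a hypothetical stand-in class, so it runs without torch, to show the lookup.

```python
class IndexedCocoLikeDataset:
    """Hypothetical stand-in for a CocoDetection subclass; not the real class."""

    def __init__(self, image_ids):
        # Mirrors CocoDetection, which stores the sorted COCO image ids in self.ids.
        self.ids = sorted(image_ids)

    def image_id_for(self, index):
        # Return the id from the annotation file, not the dataset position.
        return self.ids[index]


dataset = IndexedCocoLikeDataset([632, 139, 285])
index = 1
print(index, dataset.image_id_for(index))
```

In the dataloader above, the equivalent change would be `image_id = torch.tensor([self.ids[item]])`, assuming `self.ids` is available in the installed torchvision version.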

dorakementzey commented 1 year ago

Hi! Did you manage to fix this error? I have a similar issue with a custom dataset, but my result.json file has the same images and ids.

MarawanEldeib commented 9 months ago

@EMCP how did you solve it?

gguzzy commented 1 month ago

Hi all, I had the same issue and solved it with a script I wrote, if you want to take a look: https://github.com/gguzzy/coco-annotation-converter/blob/main/convert_coco_data_to_result This script converts the COCO data format to the result format, which lets us run the COCO evaluation scripts properly.

If you want to insert it as a merge request to cocoapi, please contact me!

I hope it helps, thanks!
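
For readers who want a rough idea of what such a conversion involves, here is a minimal sketch. It is not the linked script: the function name `annotations_to_results`, the sample data, and the default score of 1.0 are all assumptions. It relies on the COCO detection result format being a flat list of entries with `image_id`, `category_id`, `bbox` as [x, y, width, height], and `score`.

```python
import json


def annotations_to_results(coco_dataset, default_score=1.0):
    """Turn COCO-format annotations into a flat list of detection results."""
    results = []
    for ann in coco_dataset["annotations"]:
        results.append({
            "image_id": ann["image_id"],          # must match an id in "images"
            "category_id": ann["category_id"],
            "bbox": [float(v) for v in ann["bbox"]],  # [x, y, width, height]
            # Ground-truth annotations carry no score, so an assumed default is used.
            "score": float(ann.get("score", default_score)),
        })
    return results


# Hypothetical COCO-format data with a single image and annotation.
coco_data = {
    "images": [{"id": 139, "file_name": "example.jpg"}],
    "annotations": [
        {"id": 1, "image_id": 139, "category_id": 1,
         "bbox": [10, 20, 30, 40], "area": 1200, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "object"}],
}

results = annotations_to_results(coco_data)
print(json.dumps(results, indent=2))
```

The resulting list can be written to a JSON file or passed directly to `loadRes`, which, as the code quoted earlier shows, accepts a list as well as a file name.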