facebookresearch / detr

End-to-End Object Detection with Transformers

How to load custom dataset in DETR format #430

Open manrj007 opened 3 years ago

manrj007 commented 3 years ago

I'm creating a custom dataloader for DETR which uses the file directory structure below for training.

path/to/coco/

├ train
│  ├ annotations
│  │  └instance_default.json
│  ├ images
│  │  └img-001.jpg
│  │  └img-002.jpg
│  │  └...
├ test
├ valid

For this I was able to create one using Hugging Face's DetrFeatureExtractor with the code below:

from torchvision.datasets import CocoDetection
from transformers import DetrFeatureExtractor

class DetrData(CocoDetection):
    def __init__(self, img_folder, annotations, feature_extractor, train=True):
        super(DetrData, self).__init__(img_folder, annotations)
        self.feature_extractor = feature_extractor

    def __getitem__(self, idx):
        # read the PIL image and its target in COCO format
        img, target = super(DetrData, self).__getitem__(idx)
        image_id = self.ids[idx]
        target = {'image_id': image_id, 'annotations': target}
        encoding = self.feature_extractor(images=img, annotations=target, return_tensors="pt")
        encoding["pixel_values"] = encoding["pixel_values"].squeeze()  # remove batch dimension
        encoding["labels"] = encoding["labels"][0]  # remove batch dimension
        return encoding
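
For context, here is a minimal sketch of how this dataset could be plugged into a DataLoader. The checkpoint name and paths are illustrative, and on newer transformers versions the padding helper is pad rather than pad_and_create_pixel_mask:

from torch.utils.data import DataLoader
from transformers import DetrFeatureExtractor

feature_extractor = DetrFeatureExtractor.from_pretrained("facebook/detr-resnet-50")

train_dataset = DetrData(
    img_folder="path/to/coco/train/images",
    annotations="path/to/coco/train/annotations/instance_default.json",
    feature_extractor=feature_extractor,
)

def collate_fn(batch):
    # pad the images in the batch to the same size and build the pixel mask
    pixel_values = [item["pixel_values"] for item in batch]
    encoding = feature_extractor.pad_and_create_pixel_mask(pixel_values, return_tensors="pt")
    return {
        "pixel_values": encoding["pixel_values"],
        "pixel_mask": encoding["pixel_mask"],
        "labels": [item["labels"] for item in batch],
    }

train_dataloader = DataLoader(train_dataset, batch_size=2, shuffle=True, collate_fn=collate_fn)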

My question: say my image directory is like this:

├ train
│  ├ class1
│  │  ├ annotations
│  │  │  └instance_default.json
│  │  ├ images
│  │  │  └img-001.jpg
│  │  │  └img-002.jpg
│  │  └...
│  ├ class2
│  │  ├ annotations
│  │  │  └instance_default.json
│  │  ├ images
│  │  │  └img-001.jpg
│  │  │  └img-002.jpg
│  │  └...
│  ├ class'n'
├ test
├ valid

I tried using glob and Path to create the dataset, but after execution it only contained the final class. Below is the code I'm using for it:

import os
from pathlib import Path

class DetrData(CocoDetection):
    def __init__(self, img_folder, annotations, feature_extractor, train=True):
        files = Path(img_folder).glob('*')
        test_list = []
        for i in files:
            sub_img_folder = os.path.join(str(i), "images")
            annotations = os.path.join(str(i), "annotations", "instances_default.json")
            # each iteration re-initializes the CocoDetection parent,
            # so only the last class folder's annotations survive
            super(DetrData, self).__init__(sub_img_folder, annotations)
            self.feature_extractor = feature_extractor
            self.test_list = test_list

    def __getitem__(self, idx):
        img, target = super(DetrData, self).__getitem__(idx)
        image_id = self.ids[idx]
        target = {'image_id': image_id, 'annotations': target}
        encoding = self.feature_extractor(images=img, annotations=target, return_tensors="pt")
        encoding["pixel_values"] = encoding["pixel_values"].squeeze()  # remove batch dimension
        encoding["labels"] = encoding["labels"][0]  # remove batch dimension
        self.test_list.append(encoding)
        return self.test_list

Any help on this would be great.

NielsRogge commented 3 years ago

Torchvision's CocoDetection dataset expects just a single root and annFile, as can be seen here. In other words, even if you have multiple classes, you'll need to create the following directory structure:

path/to/coco/

├ train
│  ├ annotations
│  │  └instance_default.json
│  ├ images
│  │  └img-001.jpg
│  │  └img-002.jpg
│  │  └...
├ test
├ valid
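
For reference, torchvision's CocoDetection is then instantiated with that single images folder and annotation file (the paths below are illustrative):

from torchvision.datasets import CocoDetection

dataset = CocoDetection(
    root="path/to/coco/train/images",
    annFile="path/to/coco/train/annotations/instance_default.json",
)
img, target = dataset[0]  # PIL image and the list of COCO annotation dicts for it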

This is the COCO format. So I'd advise you to create a single JSON file that contains the annotations for all images, and rename the image files so that they are all unique. You can, for example, name them class1-img-001.png, etc. Make sure to use the right IDs in the annotation JSON file. This blog post, for example, is very helpful.
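
If it helps, below is a rough sketch of one way to do that merge, assuming each class folder follows the layout from the question. The merge_coco helper, its ID re-mapping, and the file names are illustrative rather than a maintained script:

import json
import shutil
from pathlib import Path

def merge_coco(split_dir, out_dir):
    """Merge per-class COCO annotation files into a single JSON and copy the
    images into one folder, prefixing each file name with its class folder."""
    out_images = Path(out_dir) / "images"
    out_ann_dir = Path(out_dir) / "annotations"
    out_images.mkdir(parents=True, exist_ok=True)
    out_ann_dir.mkdir(parents=True, exist_ok=True)

    merged = {"images": [], "annotations": [], "categories": []}
    cat_name_to_id = {}            # category name -> merged category id
    next_img_id, next_ann_id = 1, 1

    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        # adjust the file name if yours differs (e.g. instances_default.json)
        with open(class_dir / "annotations" / "instance_default.json") as f:
            coco = json.load(f)

        # keep category ids consistent across class folders by keying on the name
        old_to_new_cat = {}
        for cat in coco["categories"]:
            if cat["name"] not in cat_name_to_id:
                cat_name_to_id[cat["name"]] = len(cat_name_to_id) + 1
                merged["categories"].append({**cat, "id": cat_name_to_id[cat["name"]]})
            old_to_new_cat[cat["id"]] = cat_name_to_id[cat["name"]]

        # give every image a unique id and a unique, class-prefixed file name
        old_to_new_img = {}
        for img in coco["images"]:
            new_name = f"{class_dir.name}-{img['file_name']}"
            shutil.copy(class_dir / "images" / img["file_name"], out_images / new_name)
            old_to_new_img[img["id"]] = next_img_id
            merged["images"].append({**img, "id": next_img_id, "file_name": new_name})
            next_img_id += 1

        # point every annotation at the re-mapped image and category ids
        for ann in coco["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": next_ann_id,
                "image_id": old_to_new_img[ann["image_id"]],
                "category_id": old_to_new_cat[ann["category_id"]],
            })
            next_ann_id += 1

    with open(out_ann_dir / "instance_default.json", "w") as f:
        json.dump(merged, f)

merge_coco("path/to/classwise/train", "path/to/coco/train")

Running it once per split (train/test/valid) produces the flat layout shown above, which the DetrData class from the question can then consume directly.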