Hello,
I am attempting to fine-tune DETR-resnet50 on a custom dataset, which is why I was following your GitHub tutorial (1) as well as the Object detection with 🤗 Transformers docs (2).
I want to apply some extra data augmentation, as done in (2). However, my dataset is already in COCO format; that is, I already have my train, val, and test JSON files. This is why a class that extends torchvision.datasets.CocoDetection, as done in (1), seemed the best option to follow.
This is the baseline class to modify to add data augmentation:

import os
import torchvision

class CocoDetection(torchvision.datasets.CocoDetection):
    def __init__(self, img_folder, processor, train=True):
        ann_file = os.path.join(img_folder, "custom_train.json" if train else "custom_val.json")
        super(CocoDetection, self).__init__(img_folder, ann_file)
        self.processor = processor

    def __getitem__(self, idx):
        # read in PIL image and target in COCO format
        # feel free to add data augmentation here before passing them to the next step
        img, target = super(CocoDetection, self).__getitem__(idx)

        # preprocess image and target (converting target to DETR format,
        # resizing + normalization of both image and target)
        image_id = self.ids[idx]
        target = {'image_id': image_id, 'annotations': target}
        encoding = self.processor(images=img, annotations=target, return_tensors="pt")
        pixel_values = encoding["pixel_values"].squeeze()  # remove batch dimension
        target = encoding["labels"][0]  # remove batch dimension
        return pixel_values, target
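For context on what "add data augmentation here" would involve: the COCO-style target returned by CocoDetection is a list of annotation dicts, while Albumentations expects parallel arrays of boxes and labels. Here is a hedged sketch of how the target could be unpacked before the transform and repacked afterwards (the helper names are my own, not from the tutorial or either library):

```python
def coco_to_albu_inputs(target):
    """Split a list of COCO annotation dicts into the parallel arrays
    Albumentations expects: bboxes in [x, y, w, h] (COCO format) and
    the matching category ids."""
    bboxes = [ann["bbox"] for ann in target]
    category_ids = [ann["category_id"] for ann in target]
    return bboxes, category_ids

def albu_outputs_to_coco(bboxes, category_ids):
    """Rebuild COCO-style annotation dicts after augmentation; boxes the
    transform dropped (e.g. cropped out) are simply absent here too."""
    return [
        {"bbox": list(bbox), "category_id": cat, "area": bbox[2] * bbox[3]}
        for bbox, cat in zip(bboxes, category_ids)
    ]

# toy example: two annotations in COCO format
anns = [
    {"bbox": [10, 20, 30, 40], "category_id": 1, "area": 1200},
    {"bbox": [5, 5, 10, 10], "category_id": 2, "area": 100},
]
bboxes, cats = coco_to_albu_inputs(anns)
# (an A.Compose transform would run here); pretend it dropped the second box
rebuilt = albu_outputs_to_coco(bboxes[:1], cats[:1])
```

The rebuilt list can then be placed back under the 'annotations' key before calling the processor.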
The content of label_fields is quite confusing to me. Observe the following screenshot from the 🤗 Object detection docs:
They set label_fields=["category"], but category in that code snippet refers to the category_ids rather than the categories (classes), as shown in the Albumentations docs:
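My current understanding (which may be wrong, hence the question) is that each name in label_fields just tells the transform which extra keyword arrays are per-box labels, so they get filtered in lockstep with the boxes. A pure-Python toy mimicking that bookkeeping, not the real Albumentations library:

```python
def toy_transform(bboxes, label_fields, **kwargs):
    """Mimic the label_fields bookkeeping: keep only boxes that fit
    inside a 100x100 crop, and filter every named label array in
    lockstep (pure-Python toy, not the real library)."""
    keep = [i for i, (x, y, w, h) in enumerate(bboxes)
            if x + w <= 100 and y + h <= 100]
    out = {"bboxes": [bboxes[i] for i in keep]}
    for field in label_fields:
        out[field] = [kwargs[field][i] for i in keep]
    return out

result = toy_transform(
    bboxes=[[10, 10, 20, 20], [90, 90, 30, 30]],  # second box exceeds the crop
    label_fields=["category"],
    category=[3, 7],
)
# only the first box and its matching "category" entry survive
```

So whether the array holds class names or category_ids would not matter to the mechanism itself, only to what the model later receives.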
Another question regarding the augmentations: where should the Compose of augmentations be defined? As far as I know, we cannot access the label_fields in __init__ (what exactly label_fields refers to depends on the answer to my previous question). Should it be done directly in __getitem__?
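The pattern I have seen elsewhere is to build the pipeline once in __init__ and only call it per sample in __getitem__. A minimal sketch of that shape, with the transform left as a generic callable (in practice it would be an A.Compose with bbox_params); all names here are illustrative, not from the tutorial:

```python
class AugmentedDataset:
    """Sketch: store the transform once in __init__, apply it per
    sample in __getitem__."""
    def __init__(self, samples, transform=None):
        self.samples = samples
        self.transform = transform  # e.g. A.Compose([...], bbox_params=...)

    def __getitem__(self, idx):
        img, target = self.samples[idx]
        if self.transform is not None:
            img, target = self.transform(img, target)
        return img, target

    def __len__(self):
        return len(self.samples)

# usage with a dummy transform that upper-cases the "image" placeholder
ds = AugmentedDataset([("img0", {"category": [1]})],
                      transform=lambda im, t: (im.upper(), t))
```

But it is unclear to me whether this pattern carries over cleanly when the label arrays only become known inside __getitem__.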
In (2), the 🤗 with_transform method is used to apply the transformations to both images and labels. How can I use that method while reading the dataset with the class above?
SUMMARY
Would it be possible to make a more in-depth and clear tutorial on how to train an object detection model (applying data augmentation to both images and labels) using a CocoDetection class to process the data?
Any help on this would be highly appreciated.