sovit-123 / vision_transformers

Vision Transformers for image classification, image segmentation, and object detection.
MIT License
43 stars, 7 forks

Custom Dataset training #22

Closed Subhankhan41 closed 2 months ago

Subhankhan41 commented 2 months ago

Hi Sovit,

Thanks for making the vision_transformers repository. I am relatively new to the AI field, so I am learning from a number of sources, and your demos are especially helpful.

I have a dataset with labels in .txt and images in .jpg for both training and validation. I can convert the .txt labels to .xml or .json. However, I am a bit confused about the background class. My dataset has 3 classes, ['objectA', 'objectB', 'objectC'], with class IDs 0: objectA, 1: objectB, 2: objectC. How should I handle the background class? Do I have to convert the class IDs in each label so that 0 is reserved for background?

Also, if I want to train on a custom dataset, could you please let me know how I should pass the data?

Thanks
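
For readers with the same starting point, the .txt to .xml conversion mentioned above could look roughly like the sketch below. It assumes the .txt files use the YOLO layout (one `class_id x_center y_center width height` line per object, normalized to the image size), which the question does not state explicitly, and it writes a common Pascal VOC style XML. The function names are hypothetical and the exact tags the repository's data loader reads should be checked against its code.

```python
import xml.etree.ElementTree as ET

# Class names in the order of the dataset's numeric IDs (taken from the question).
CLASS_NAMES = ['objectA', 'objectB', 'objectC']


def yolo_line_to_box(line, img_w, img_h):
    """Convert one normalized 'id cx cy w h' line to (name, xmin, ymin, xmax, ymax) in pixels."""
    class_id, cx, cy, w, h = line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # Clamp to the image bounds so rounding never produces out-of-range boxes.
    xmin = max(0, round(cx - w / 2))
    ymin = max(0, round(cy - h / 2))
    xmax = min(img_w, round(cx + w / 2))
    ymax = min(img_h, round(cy + h / 2))
    return CLASS_NAMES[int(class_id)], xmin, ymin, xmax, ymax


def yolo_to_voc_xml(label_text, filename, img_w, img_h):
    """Build a Pascal VOC style XML string from the text of one YOLO label file."""
    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = filename
    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(img_w)
    ET.SubElement(size, 'height').text = str(img_h)
    ET.SubElement(size, 'depth').text = '3'
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        name, xmin, ymin, xmax, ymax = yolo_line_to_box(line, img_w, img_h)
        obj = ET.SubElement(root, 'object')
        # The class is stored as a string, not as a numeric ID.
        ET.SubElement(obj, 'name').text = name
        box = ET.SubElement(obj, 'bndbox')
        for tag, value in zip(('xmin', 'ymin', 'xmax', 'ymax'),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding='unicode')
```

Because the XML stores class names rather than numeric IDs, the original 0/1/2 numbering does not need to be shifted by hand at this stage.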

sovit-123 commented 2 months ago

You just need to manage the final dataset configuration file properly. No need to modify the XML file again. The code will match the strings of the XML file with the data configuration file. Following is an example.

# Images and labels directory should be relative to train.py
TRAIN_DIR_IMAGES: '../../object_detection/input/Aquarium Combined.v2-raw-1024.voc/train'
TRAIN_DIR_LABELS: '../../object_detection/input/Aquarium Combined.v2-raw-1024.voc/train'
VALID_DIR_IMAGES: '../../object_detection/input/Aquarium Combined.v2-raw-1024.voc/valid'
VALID_DIR_LABELS: '../../object_detection/input/Aquarium Combined.v2-raw-1024.voc/valid'

# Class names.
CLASSES: [
    '__background__',
    'fish', 'jellyfish', 'penguin',
    'shark', 'puffin', 'stingray',
    'starfish'
]

# Number of classes (object classes + 1 for background).
NC: 8

# Whether to save the predictions of the validation set while training.
SAVE_VALID_PREDICTION_IMAGES: True
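
Adapting this configuration to the three-class dataset from the question means prepending '__background__' to the class list and setting NC to 4. A small consistency check along these lines can catch an off-by-one in NC before training starts. This is only a sketch: `check_dataset_config` is a hypothetical helper, not part of the repository, and the config is written as a plain dict here rather than loaded from the YAML file.

```python
def check_dataset_config(config):
    """Return a list of problems found in a dataset config dict (empty if none)."""
    classes = config['CLASSES']
    problems = []
    # The example config lists '__background__' first, so it gets index 0.
    if not classes or classes[0] != '__background__':
        problems.append("first entry of CLASSES should be '__background__'")
    # NC counts the object classes plus one for the background.
    if config['NC'] != len(classes):
        problems.append(
            f"NC is {config['NC']} but CLASSES has {len(classes)} entries"
        )
    if len(set(classes)) != len(classes):
        problems.append('CLASSES contains duplicate names')
    return problems


# Hypothetical config for the three-class dataset from the question.
custom_config = {
    'CLASSES': ['__background__', 'objectA', 'objectB', 'objectC'],
    'NC': 4,
}

for problem in check_dataset_config(custom_config):
    print('Config problem:', problem)
```

With the values above the loop prints nothing, since the list starts with the background entry and NC matches its length.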

Subhankhan41 commented 2 months ago

Thanks for the response. However, I am still a bit confused about the background class. My dataset has 3 classes, ['objectA', 'objectB', 'objectC'], with class IDs 0: objectA, 1: objectB, 2: objectC. How should I handle the background class? Do I have to convert the class IDs in each label so that 0 is reserved for background?

sovit-123 commented 2 months ago

No, you do not need to convert the IDs. The code will handle that internally. It checks the dataset configuration file for the index position of each class name, so __background__ will automatically become 0.
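
The index lookup described here can be illustrated with a short sketch. It is not the repository's actual code, only an assumption about the general idea: each class-name string read from an annotation is looked up in the CLASSES list, and its position becomes the training label. The `label_index` helper is hypothetical.

```python
# Class list as it would appear in the dataset configuration file,
# with '__background__' placed first.
CLASSES = ['__background__', 'objectA', 'objectB', 'objectC']


def label_index(class_name, classes=CLASSES):
    """Return the integer label for a class name, i.e. its position in the list."""
    return classes.index(class_name)


# Class names as they might be read from the <name> tags of one XML file.
xml_names = ['objectB', 'objectA', 'objectC']
labels = [label_index(name) for name in xml_names]
print(labels)  # [2, 1, 3]
```

Under this scheme objectA, objectB, and objectC end up as 1, 2, and 3, while 0 stays reserved for the background without any change to the annotation files.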

Subhankhan41 commented 2 months ago

Thanks, Sovit. I have tested it and it works well, great job. RT-DETR2 has also been released, and I can't wait for your new tutorial/repo on it. One more idea: you could try the schedule-free optimizers from https://github.com/facebookresearch/schedule_free, e.g. optimizer = schedulefree.AdamWScheduleFree(params, lr=0.001). I noticed that it can improve the mAP significantly compared to using schedulers.

sovit-123 commented 2 months ago

Thanks for the suggestions. I will surely take them up. Hopefully this issue is solved, so I am closing it for now. If needed, please re-open.