lovelyqian / CDFSOD-benchmark

A benchmark for cross-domain few-shot object detection (ECCV24 paper: Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector)
Apache License 2.0

Step-by-Step Guide for Training CD-ViTO on a Custom Dataset #10

Open AdityaKallingal opened 2 weeks ago

AdityaKallingal commented 2 weeks ago

Hi,

First, I want to thank you for your incredible work on CD-ViTO. I’m really excited to explore its capabilities for few-shot object detection! I’ve successfully gone through the initial setup and experiments, but now I’d like to train the model on my own custom dataset.

Could you please provide a step-by-step guide or any pointers on how to prepare and train CD-ViTO using custom data? Specifically, I would appreciate some insights on:

Dataset Preparation:

1. What format should the custom dataset be in (e.g., COCO, Pascal VOC)?
2. How should annotations be structured for compatibility with the model?
3. Are any preprocessing steps required?

Configuration:

1. What changes need to be made to the configuration files (e.g., specifying the number of classes, dataset paths, etc.)?

Training Pipeline:

1. How can I initiate training on my custom dataset? Are there specific command-line arguments or scripts I should follow?

I would greatly appreciate it if you could walk me through the process or point me to relevant parts of the documentation (if available).

Thank you for your time and for creating this amazing tool!

Best regards, Aditya.

AImind commented 1 day ago

1. Convert Datasets to COCO Format

Before starting, ensure that your datasets are converted into the COCO format (JSON file). The COCO format is widely used for object detection tasks and consists of three main components:

images: Contains the metadata of the images (ID, file name, dimensions, etc.).

annotations: Contains the annotations for each image, including the bounding box coordinates, segmentation masks, and category IDs.

categories: Contains the list of object categories (classes).

You can use custom scripts or available tools to convert datasets from various formats (Pascal VOC, YOLO, etc.) into the COCO format. Once the datasets are in COCO format, proceed to the next step.
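To make the expected structure concrete, here is a minimal COCO-format file built by hand (file names, class names, and box values are made up for illustration):

```python
import json

# Minimal COCO detection file: three top-level keys.
coco = {
    "images": [
        # Metadata for each image: unique id, file name, dimensions.
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        # One box per annotation; bbox is [x, y, width, height] in pixels.
        # "area" and "iscrowd" are expected by most COCO loaders.
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 80], "area": 50 * 80, "iscrowd": 0},
    ],
    "categories": [
        # The list of object classes; ids must match the annotations above.
        {"id": 1, "name": "widget", "supercategory": "object"},
    ],
}

with open("dataset.json", "w") as f:
    json.dump(coco, f, indent=2)
```

Any converter you write or reuse (from Pascal VOC, YOLO, etc.) should produce a JSON file with exactly this shape.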

2. Split the Dataset into Training and Testing Sets

After converting your dataset into COCO format, use the provided split.py script to split the dataset into training and testing sets. The script randomly divides the dataset into the specified proportions (e.g., 80% for training and 20% for testing).

Here’s how to use split.py:

1. Place your COCO-format dataset (JSON file) in the appropriate directory.

2. Run split.py to split the dataset.

This script will output two new JSON files:

1. train.json: contains the training data.
2. test.json: contains the testing data.
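Since split.py itself is not reproduced here, the following is only a sketch of what an image-level random split of a COCO file might look like (the function name and 80/20 ratio are illustrative, not the script's actual interface):

```python
import random

def split_coco(coco, train_ratio=0.8, seed=0):
    """Randomly split a COCO-format dict into train/test dicts by image.

    Annotations follow their image, so no box is shared across splits.
    """
    images = list(coco["images"])
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * train_ratio)
    train_imgs, test_imgs = images[:n_train], images[n_train:]

    def subset(imgs):
        ids = {im["id"] for im in imgs}
        return {
            "images": imgs,
            "annotations": [a for a in coco["annotations"]
                            if a["image_id"] in ids],
            "categories": coco["categories"],  # shared by both splits
        }

    return subset(train_imgs), subset(test_imgs)
```

Writing each returned dict out with `json.dump` gives you the train.json and test.json files described above.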

3. Select k-shot Samples from the Training Set

After splitting the dataset, you can select a suitable number of samples from the training set for k-shot learning. k-shot refers to selecting a small number (k) of examples for each category in the dataset.

To generate a k-shot dataset from the training set, follow these steps:

1. Use the training dataset generated from the split (e.g., train.json).

2. Run a custom script to extract k samples for each category from the training dataset. You can set k to 1, 5, or 10 as needed for your k-shot learning task. (Make sure each selected image contains exactly one annotation.)

Once this process is complete, kshot_train.json will contain the selected k-shot data, which can be used to train k-shot models.
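Because the k-shot extraction script is left to the user, here is one hedged sketch of the idea, assuming (as noted above) that each image carries exactly one annotation, so picking k annotations per category picks k images per category:

```python
import random
from collections import defaultdict

def select_kshot(coco, k, seed=0):
    """Pick k examples per category from a COCO-format dict.

    Assumes each image has exactly one annotation, so every selected
    annotation contributes one distinct image of its category.
    """
    rng = random.Random(seed)

    # Group annotations by category.
    by_cat = defaultdict(list)
    for ann in coco["annotations"]:
        by_cat[ann["category_id"]].append(ann)

    # Sample up to k annotations from each category.
    chosen = []
    for anns in by_cat.values():
        chosen.extend(rng.sample(anns, min(k, len(anns))))

    img_ids = {a["image_id"] for a in chosen}
    return {
        "images": [im for im in coco["images"] if im["id"] in img_ids],
        "annotations": chosen,
        "categories": coco["categories"],
    }
```

Dumping the returned dict to JSON produces the kshot_train.json file mentioned below.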

4. Adding Custom Datasets

Before training, update these two files: lib/categories.py and detectron2/data/datasets/build.py. Then add your dataset's name to 'datasets_name'.
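The exact edits to lib/categories.py and build.py are specific to this repo, but for orientation, registering a COCO-format dataset in vanilla detectron2 usually goes through `register_coco_instances`; the dataset name, JSON paths, and image directory below are placeholders, not paths from this benchmark:

```python
from detectron2.data.datasets import register_coco_instances

# Hypothetical names/paths: adapt to your own layout.
# The first argument is the name you reference in the config's
# DATASETS.TRAIN / DATASETS.TEST fields.
register_coco_instances(
    "my_dataset_1shot_train",                 # dataset name for the config
    {},                                       # extra metadata (empty here)
    "datasets/my_dataset/1shot_train.json",   # COCO-format annotation file
    "datasets/my_dataset/images",             # image root directory
)
register_coco_instances(
    "my_dataset_test",
    {},
    "datasets/my_dataset/test.json",
    "datasets/my_dataset/images",
)
```

Check how the benchmark's existing datasets are wired through those two files and mirror that pattern for your own dataset name.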