xavibou / ovdsat

Official implementation of the paper 'Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery'

Inferencing for new classes #4

Closed wbcmthh42 closed 1 month ago

wbcmthh42 commented 2 months ago

Hi @xavibou , i have experimented with my own data containing 2 new classes: solar panel and greenroof, by adding them to the list of your current 15 classes (selected only a few images). I’ve used about 40:40:15 labeled objects per new class for the train:val:test.

Here are the commands used:

```
python build_prototypes.py --data_dir "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\train_XXX" --save_dir "C:\Users\XXX\Documents\XXX\ovdsat\run\init_prototypes\XXX\XXX\boxes\simd_N5" --annotations_file "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\train_coco_subset_N5_XXX.json" --backbone_type dinov2 --target_size 602 602 --window_size 224 --scale_factor 1 --num_b 10 --k 200 --store_bg_prototypes
```

```
python train.py --train_root_dir "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\train_XXX" --val_root_dir "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\val_XXX" --save_dir "C:\Users\XXX\Documents\XXX\ovdsat\run\train\XXX\XXX\boxes\simd_N5\e10_bs4" --train_annotations_file "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\train_coco_subset_N5_XXX.json" --val_annotations_file "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\train_coco_finetune_val_XXX.json" --prototypes_path "C:\Users\XXX\Documents\XXX\ovdsat\run\init_prototypes\XXX\XXX\boxes\simd_N5\dinov2\prototypes_dinov2.pt" --backbone_type dinov2 --num_epochs 10 --lr 2e-4 --target_size 602 602 --batch_size 4 --num_neg 0 --num_workers 8 --iou_thr 0.1 --conf_thres 0.2 --scale_factor 1 --annotations box --only_train_prototypes
```

```
python eval_detection.py --dataset simd --val_root_dir "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\test_XXX" --save_dir "C:\Users\XXX\Documents\XXX\ovdsat\run\eval\XXX\XXX\detection\simd\backbone_dinov2_boxes\N5\test_XXX\e10_bs4" --val_annotations_file "C:\Users\XXX\Documents\XXX\ovdsat\data\simd\instances_default_test_XXX.json" --prototypes_path "C:\Users\XXX\Documents\XXX\ovdsat\run\train\XXX\XXX\boxes\simd_N5\e10_bs4\prototypes.pth" --bg_prototypes_path "C:\Users\XXX\Documents\XXX\ovdsat\run\init_prototypes\XXX\XXX\boxes\simd_N5\dinov2\bg_prototypes_dinov2.pt" --backbone_type dinov2 --classification box --target_size 602 602 --batch_size 2 --num_workers 8 --scale_factor 1
```
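
As a quick sanity check between these steps, the generated prototype file can be loaded and inspected directly. A minimal sketch, assuming the `.pt` files are ordinary torch checkpoints; the path is a placeholder for the redacted one above:

```python
import torch

# Placeholder path; point this at the prototype file written by build_prototypes.py.
# Assumes the .pt file is an ordinary torch checkpoint (a dict or a tensor).
ckpt = torch.load("prototypes_dinov2.pt", map_location="cpu")

if isinstance(ckpt, dict):
    # Print every entry with its tensor shape so the new classes can be spotted.
    for key, value in ckpt.items():
        print(key, getattr(value, "shape", type(value)))
else:
    print(type(ckpt), getattr(ckpt, "shape", None))
```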

The train/val output after `train.py`:

Picture 1

And the unseen-test output after `eval_detection.py`:

Picture 2

Here are a few test images (blue = ground truth, green = predictions): Picture 3, Picture 4, Picture 5.
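
For anyone wanting to reproduce this kind of overlay, a minimal sketch using matplotlib, assuming COCO-style `[x, y, w, h]` boxes; the helper is hypothetical, not part of ovdsat:

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

# Hypothetical helper: draws ground truth in blue and predictions in green
# on top of an image, using COCO-style [x, y, w, h] boxes.
def draw_overlay(image_path, gt_boxes, pred_boxes):
    fig, ax = plt.subplots()
    ax.imshow(Image.open(image_path))
    for x, y, w, h in gt_boxes:
        ax.add_patch(patches.Rectangle((x, y), w, h, fill=False, edgecolor="blue", linewidth=2))
    for x, y, w, h in pred_boxes:
        ax.add_patch(patches.Rectangle((x, y), w, h, fill=False, edgecolor="green", linewidth=2))
    ax.axis("off")
    plt.show()
```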

I’d like to understand:

  1. Is the approach I've taken correct for predicting new custom classes? The predictions look quite different from those for your original classes.
  2. What is the minimum number of instances (train + val, excluding the test set) required to fine-tune a new class?
  3. Would the DIOR dataset be a better choice than SIMD in this case, given that its categories are more similar in nature to my new classes?
xavibou commented 1 month ago

Hi, sorry for the late reply. Here are my answers to the different points you raised:

  1. The training log looks odd to me: some classes achieve a decent accuracy while others are stuck at 0. I would first suggest reproducing the results reported with the original split. Once that is done and training works properly for the original classes, adding a new class just involves adding the new images and their annotations to the COCO-format JSON annotation file (a sketch after this list illustrates the idea). Also, the last image you show looks as though there are two helicopter classes: the objects are detected twice and drawn in two different colors. This is very strange; make sure your labels and the categories in the JSON annotation files are consistent!

  2. The experiments reported in the article use 5, 10 and 30 examples per class; you can check the results in the different tables of the paper. The validation set provided contains around 15-20 examples per class, although the count is not the same for every class.

  3. Indeed, if your objects of interest look more similar to the objects in the DIOR dataset, pre-training your RPN on DIOR should yield better results. The fact that DIOR has a greater variety of objects with different appearances might also help in that regard.
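
A minimal sketch of the annotation bookkeeping that points 1 and 2 call for, assuming standard COCO-format JSON; the file name and the new "solar_panel" category entry are placeholders, not part of the ovdsat code:

```python
import json
from collections import Counter

# Placeholder file name; substitute your own COCO-format annotation file.
with open("train_coco_subset_N5.json") as f:
    coco = json.load(f)

# Point 1: category ids must be consistent everywhere. List the declared
# categories and check that every annotation points at one of them.
categories = {c["id"]: c["name"] for c in coco["categories"]}
print("categories:", categories)
for ann in coco["annotations"]:
    assert ann["category_id"] in categories, f"unknown category id {ann['category_id']}"

# Adding a new class is just a new category entry plus image and annotation
# entries that reference its id.
new_id = max(categories) + 1
coco["categories"].append({"id": new_id, "name": "solar_panel", "supercategory": "none"})

# Point 2: count annotated instances per class to see where you stand
# relative to the 5/10/30-shot settings reported in the paper.
print(Counter(categories[a["category_id"]] for a in coco["annotations"]))
```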

Hope this helps!