facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

Output mask is almost the same in all high confidence queries #483

Closed (truncs closed this issue 2 years ago)

truncs commented 2 years ago

Instructions To Reproduce the Issue:

  1. what changes you made (git diff) or what code you wrote

I am fine-tuning the COCO DETR checkpoint on my own dataset of 24 classes (I changed the number of classes in the script). The dataset is in COCO (panoptic) format. Annotation for a single image:

 [{"image_id": 1, "file_name": "frame000105.png", "segments_info":  [{"id": 1, "category_id": 1, "area": 10948.0, "bbox": [425.0, 50.0, 171.0, 88.0], "iscrowd": 0}, {"id": 2, "category_id": 1, "area": 709.0, "bbox": [213.0, 74.0, 38.0, 25.0], "iscrowd": 0}, {"id": 3, "category_id": 1, "area": 576.0, "bbox": [47.0, 69.0, 44.0, 20.0], "iscrowd": 0}, {"id": 4, "category_id": 3, "area": 613.0, "bbox": [300.0, 98.0, 57.0, 23.0], "iscrowd": 0}, {"id": 5, "category_id": 15, "area": 159.0, "bbox": [314.0, 7.0, 11.0, 24.0], "iscrowd": 0}, {"id": 6, "category_id": 16, "area": 69.0, "bbox": [344.0, 115.0, 14.0, 7.0], "iscrowd": 0}, {"id": 7, "category_id": 16, "area": 88.0, "bbox": [303.0, 110.0, 26.0, 5.0], "iscrowd": 0}, {"id": 8, "category_id": 16, "area": 15620.0, "bbox": [279.0, 200.0, 320.0, 70.0], "iscrowd": 0}, {"id": 9, "category_id": 1, "area": 1037.0, "bbox": [0.0, 51.0, 40.0, 34.0], "iscrowd": 0}, {"id": 10, "category_id": 1, "area": 129.0, "bbox": [630.0, 118.0, 10.0, 20.0], "iscrowd": 0}, {"id": 11, "category_id": 1, "area": 239.0, "bbox": [109.0, 72.0, 17.0, 17.0], "iscrowd": 0}, {"id": 12, "category_id": 2, "area": 1360.0, "bbox": [345.0, 40.0, 30.0, 81.0], "iscrowd": 0}, {"id": 13, "category_id": 19, "area": 14462.0, "bbox": [206.0, 112.0, 274.0, 75.0], "iscrowd": 0}, {"id": 14, "category_id": 3, "area": 503.0, "bbox": [371.0, 110.0, 66.0, 13.0], "iscrowd": 0}, {"id": 15, "category_id": 3, "area": 100393.0, "bbox": [0.0, 187.0, 640.0, 213.0], "iscrowd": 0}, {"id": 16, "category_id": 16, "area": 56.0, "bbox": [370.0, 116.0, 10.0, 6.0], "iscrowd": 0}, {"id": 17, "category_id": 19, "area": 1718.0, "bbox": [49.0, 133.0, 88.0, 35.0], "iscrowd": 0}, {"id": 18, "category_id": 2, "area": 17095.0, "bbox": [93.0, 0.0, 133.0, 230.0], "iscrowd": 0}]}
  2. what exact command you run: To fine-tune the bounding boxes starting from the modified checkpoint (with the class-head weights removed), I ran the following:
python3 main.py --dataset_file coco_panoptic --coco_path panoptic/ --coco_panoptic_path panoptic/ --epochs 300 --lr=1e-4 --batch_size=8 --num_workers=4 --output_dir="outputs" --resume="detr-r50_no-class-head.pth"

To fine-tune the segmentation head, I ran:

 python3 main.py --masks --dataset_file coco_panoptic --coco_path panoptic/ --coco_panoptic_path panoptic/ --epochs 25 --lr=1e-4 --lr_drop 15 --batch_size=4 --num_workers=4 --output_dir="segm" --frozen_weights outputs/checkpoint.pth
  3. what you observed (including full logs):

[Attached image: detr]
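For readers reproducing the first command: a checkpoint such as `detr-r50_no-class-head.pth` is usually made by dropping the classification-head weights from a pretrained checkpoint, so that a head with a different number of classes can be trained from scratch. The following is a minimal sketch of that step, not necessarily how the author produced the file. It assumes the checkpoint is a dict with a `"model"` entry and that the head's parameters are named `class_embed.weight` and `class_embed.bias`. The input filename in the usage comment is hypothetical.

```python
from typing import Any, Dict


def strip_class_head(checkpoint: Dict[str, Any],
                     prefix: str = "class_embed.") -> Dict[str, Any]:
    """Return a shallow copy of a checkpoint without the classification head.

    Assumes checkpoint["model"] is a state dict whose classification-head
    keys start with `prefix` (an assumption about the checkpoint layout).
    The input checkpoint is left unmodified.
    """
    model_state = checkpoint["model"]
    kept = {k: v for k, v in model_state.items() if not k.startswith(prefix)}
    new_checkpoint = dict(checkpoint)
    new_checkpoint["model"] = kept
    return new_checkpoint


# Usage with PyTorch (input filename is hypothetical):
#   import torch
#   ckpt = torch.load("detr-r50.pth", map_location="cpu")
#   torch.save(strip_class_head(ckpt), "detr-r50_no-class-head.pth")
```

Loading the stripped file then has to tolerate the missing `class_embed` keys, for example by calling `load_state_dict` with `strict=False`.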

Expected behavior:

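The combined mask is accurate as a whole, but there is no instance separation: every high-confidence query outputs almost the same mask. I expected each query to produce a mask for its own instance.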
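One way to quantify the symptom is to compute the pairwise IoU between the binarized masks of the high-confidence queries. Off-diagonal values close to 1 mean the queries share the same mask. This is only a diagnostic sketch: it assumes the masks are already available as a `(Q, H, W)` array of probabilities (for instance a sigmoid applied to the model's mask logits), and the function name and the 0.5 threshold are illustrative choices.

```python
import numpy as np


def pairwise_mask_iou(masks: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return the (Q, Q) IoU matrix for Q per-query masks.

    masks: array of shape (Q, H, W) with values in [0, 1].
    """
    # Binarize and flatten each mask to a row vector.
    binary = (masks > threshold).reshape(masks.shape[0], -1).astype(np.float64)
    # Dot products of 0/1 vectors count the shared foreground pixels.
    intersection = binary @ binary.T
    areas = binary.sum(axis=1)
    union = areas[:, None] + areas[None, :] - intersection
    # Where both masks are empty the union is 0; report an IoU of 0 there.
    return np.divide(intersection, union,
                     out=np.zeros_like(intersection), where=union > 0)


# Usage: keep only the queries whose best non-background class score exceeds
# some confidence, then inspect the off-diagonal entries of the result.
```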

alex-613 commented 1 year ago

Hi, I was wondering how this issue was resolved in the end. I have a custom dataset of 2 classes in COCO format. I modified the number of classes in the script, ran the same commands as you, and observed the same thing (i.e., all correlation maps are identical). I started from a pretrained R50 model (trained on COCO) and retrained it on my own dataset. Detection gave perfectly acceptable results, so I froze the weights of that model for the segmentation task, but I could not get segmentation to work.

FYI, my dataset contains two objects, one significantly bigger than the other, and the smaller one is always enclosed within the larger one. The detection model found both objects well, but segmentation did not work.
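With nested objects it may be worth confirming that the ground-truth PNGs really separate the instances, since in the COCO panoptic convention every pixel carries exactly one segment id, encoded as `R + 256*G + 256^2*B`, with 0 meaning unlabeled. The sketch below decodes such an image and compares it with `segments_info`. It assumes that `area` stores the pixel count of each segment. The function names are illustrative, and this is a sanity check rather than a known fix for the issue.

```python
from typing import Dict, List

import numpy as np


def rgb_to_id(rgb: np.ndarray) -> np.ndarray:
    """Decode an (H, W, 3) uint8 panoptic image into an (H, W) segment-id map."""
    rgb = rgb.astype(np.int64)  # avoid uint8 overflow when scaling channels
    return rgb[..., 0] + 256 * rgb[..., 1] + 256 * 256 * rgb[..., 2]


def check_segments(id_map: np.ndarray, segments_info: List[Dict]) -> List[str]:
    """Return a list of mismatches between the id map and segments_info."""
    ids, counts = np.unique(id_map, return_counts=True)
    pixels = dict(zip(ids.tolist(), counts.tolist()))
    problems = []
    for seg in segments_info:
        n = pixels.get(seg["id"], 0)
        if n == 0:
            problems.append(f"segment {seg['id']} has no pixels in the PNG")
        elif n != int(seg["area"]):
            problems.append(
                f"segment {seg['id']}: area {seg['area']} in JSON, {n} pixels in PNG")
    listed = {seg["id"] for seg in segments_info}
    # Id 0 is the unlabeled region, so it is not expected in segments_info.
    for extra in sorted(set(pixels) - listed - {0}):
        problems.append(f"PNG contains id {extra} that is not in segments_info")
    return problems


# Usage with Pillow (the path is up to the caller):
#   from PIL import Image
#   id_map = rgb_to_id(np.array(Image.open(png_path).convert("RGB")))
#   print(check_segments(id_map, annotation["segments_info"]))
```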