orrzohar / PROB

[CVPR 2023] Official Pytorch code for PROB: Probabilistic Objectness for Open World Object Detection
Apache License 2.0

Too many unknown objects with high confidence #19

Closed BigBuffa1o closed 1 year ago

BigBuffa1o commented 1 year ago

Hi. Thanks for the amazing work! I used your network on my custom dataset with 7 categories, without changing any other parameters, and trained it. It turns out that the known objects are detected well; however, there are many unknown objects with high confidence, and even after NMS it still does not look good. Here is my visualization image:

Btw, when I train on my data, the training script also reports that many unknown objects are detected. Is this normal? Looking forward to your reply.

orrzohar commented 1 year ago

Hi @BigBuffa1o,

I believe you need to do some hyperparameter tuning here, as this dataset is quite different compared to COCO. Some pointers:

  1. pred_per_image: this controls the number of predictions made per image. I would definitely bring this down to make fewer predictions per image. It should roughly scale with the number of classes (not linearly), e.g., for LVIS with 1203 classes, it is common practice to select 300. For COCO with 80 classes, it is common to select 100; here, with 7 classes, you really need to bring this down. My rough guess is ~20-50, but don't hold me to it. You can edit this here: https://github.com/orrzohar/PROB/blob/09d814353933f8bdfe944c8afd4b57b8416ebc8e/models/prob_deformable_detr.py#L514 or you can introduce this parameter to the args parser and add it here: https://github.com/orrzohar/PROB/blob/09d814353933f8bdfe944c8afd4b57b8416ebc8e/models/prob_deformable_detr.py#LL639C32-L639C32

  2. objectness temperature: you probably need to tune the objectness temperature to balance the confidence between the known and unknown objects. Ideally, you want the confidences to be similar, with obvious known classes having slightly higher confidence than the unknown ones.

  3. You should probably introduce a confidence threshold for the detections. Right now, the model always outputs 'pred_per_image' detections per image, to remain consistent with prior work. However, in realistic settings, I believe one has to introduce a threshold as well. This has to do with how mAP is calculated: it does not penalize wrong predictions as long as you also made a correct prediction with higher confidence.

  4. num_queries: this controls how many queries are used in the model. As you have fewer, bigger objects, you could probably get away with reducing this as well.
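To make points 1-3 concrete, here is a minimal post-processing sketch in PyTorch. This is not the actual PROB code: the function name, the separate `obj_logits` tensor, and the default values are illustrative assumptions, but the mechanics (temperature-scaled objectness, top-k selection sized by `pred_per_image`, and a score threshold) match the knobs described above.

```python
import torch

def select_detections(class_logits, obj_logits, pred_per_image=50,
                      obj_temperature=1.3, score_threshold=0.3):
    """Illustrative post-processing sketch (not the exact PROB code).

    class_logits: [num_queries, num_classes] raw class logits
    obj_logits:   [num_queries] raw objectness logits
    """
    # (2) objectness temperature: >1 softens objectness confidence,
    # <1 sharpens it; tune to balance known vs. unknown scores.
    obj_scores = torch.sigmoid(obj_logits / obj_temperature)      # [Q]
    class_scores = torch.sigmoid(class_logits)                    # [Q, C]
    scores = class_scores * obj_scores[:, None]                   # fused score

    # (1) pred_per_image: keep only the k highest-scoring
    # (query, class) pairs instead of one box per query.
    flat = scores.flatten()
    topk_scores, topk_idx = flat.topk(min(pred_per_image, flat.numel()))
    query_idx = torch.div(topk_idx, scores.shape[1], rounding_mode="floor")
    class_idx = topk_idx % scores.shape[1]

    # (3) threshold: discard low-confidence detections entirely,
    # rather than always emitting pred_per_image boxes.
    keep = topk_scores > score_threshold
    return query_idx[keep], class_idx[keep], topk_scores[keep]
```

Lowering `pred_per_image` and raising `score_threshold` are the quickest ways to suppress the flood of high-confidence unknowns, since both act purely at inference time.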

The good news? You can tune 1-3 on an already-trained model, i.e., without retraining, as these parameters only affect inference, not training. 4 might matter for getting the best results, but I would try 1-3 first to see if you get adequate performance.

Hope this helps! Orr