orrzohar / PROB

[CVPR 2023] Official Pytorch code for PROB: Probabilistic Objectness for Open World Object Detection
Apache License 2.0

Too many unknown objects with high confidence #19

Closed BigBuffa1o closed 1 year ago

BigBuffa1o commented 1 year ago

Hi. Thanks for the amazing work! I used your network on my custom dataset with 7 categories, without changing any other parameters, and trained it. It turns out that the known objects are detected well; however, there are many unknown objects with high confidence, and even after NMS it still does not look good. Here is my visualization image:

Btw, when I train on my data, the training script also reports that many unknown objects are detected. Is this normal? Looking forward to your reply.

orrzohar commented 1 year ago

Hi @BigBuffa1o,

I believe you need to do some hyperparameter tuning here, as this dataset is quite different compared to COCO. Some pointers:

  1. pred_per_image: this controls the number of predictions made per image. I would definitely bring this down to make fewer predictions per image. It should roughly scale with the number of classes (not linearly), e.g., for LVIS with 1203 classes, it is common practice to select 300. For COCO with 80 classes, it is common to select 100; here, with 7 classes, you really need to bring this down. My rough guess is ~20-50, but don't hold me to it. You can edit this here: https://github.com/orrzohar/PROB/blob/09d814353933f8bdfe944c8afd4b57b8416ebc8e/models/prob_deformable_detr.py#L514 or you can introduce this parameter to the args parser and add it here: https://github.com/orrzohar/PROB/blob/09d814353933f8bdfe944c8afd4b57b8416ebc8e/models/prob_deformable_detr.py#LL639C32-L639C32

  2. objectness temperature: you probably need to tune the objectness temperature to balance the confidence between the known and unknown objects. Ideally, you want the confidences to be similar, with obvious known classes having slightly higher confidence than the unknown ones.

  3. You should probably introduce a confidence threshold for the detections. Right now, the model always outputs 'pred_per_image' detections per image, to remain consistent with prior work. However, in realistic settings, I believe one has to introduce a threshold as well. This has to do with how mAP is calculated: it does not penalize wrong predictions as long as you also made a correct prediction with higher confidence.

  4. num_queries: this controls how many queries are used in the model. As you have fewer, bigger objects, you could probably get away with reducing this as well.
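To make points 1-3 concrete, here is a minimal post-processing sketch in PyTorch. This is not the actual PROB code: the function name, the separate `obj_logits` tensor, and the default values are illustrative assumptions, but the mechanics (temperature-scaled objectness, top-k selection sized by `pred_per_image`, and a score threshold) match the knobs described above.

```python
import torch

def select_detections(class_logits, obj_logits, pred_per_image=50,
                      obj_temperature=1.3, score_threshold=0.3):
    """Illustrative post-processing sketch (not the exact PROB code).

    class_logits: [num_queries, num_classes] raw class logits
    obj_logits:   [num_queries] raw objectness logits
    """
    # (2) objectness temperature: >1 softens objectness confidence,
    # <1 sharpens it; tune to balance known vs. unknown scores.
    obj_scores = torch.sigmoid(obj_logits / obj_temperature)      # [Q]
    class_scores = torch.sigmoid(class_logits)                    # [Q, C]
    scores = class_scores * obj_scores[:, None]                   # fused score

    # (1) pred_per_image: keep only the k highest-scoring
    # (query, class) pairs instead of one box per query.
    flat = scores.flatten()
    topk_scores, topk_idx = flat.topk(min(pred_per_image, flat.numel()))
    query_idx = torch.div(topk_idx, scores.shape[1], rounding_mode="floor")
    class_idx = topk_idx % scores.shape[1]

    # (3) threshold: discard low-confidence detections entirely,
    # rather than always emitting pred_per_image boxes.
    keep = topk_scores > score_threshold
    return query_idx[keep], class_idx[keep], topk_scores[keep]
```

Lowering `pred_per_image` and raising `score_threshold` are the quickest ways to suppress the flood of high-confidence unknowns, since both act purely at inference time.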

The good news? You can tune 1-3 on an already-trained model, i.e., without retraining, as these parameters only affect inference, not training. 4 might matter for getting the best results, but I would try 1-3 first to see if you get adequate performance.

Hope this helps! Orr