longzw1997 / Open-GroundingDino

This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
MIT License
396 stars, 62 forks

Discrepancy between the model's predictions and the confidence scores #33

Closed: Azure-107 closed this issue 10 months ago

Azure-107 commented 10 months ago

Thank you so much for the amazing work!

I used your implementation to train a model on a custom dataset of only 10 images for 500 epochs, expecting the model to memorize them. I then ran the official Grounding DINO inference script on one of the training images, using the resulting weights, to test its performance.

The model exhibited promising results by correctly drawing bounding boxes and accurately predicting the class. However, I observed a notable discrepancy in the confidence scores (as shown in the attached image). Despite the model's correct predictions, the confidence scores were unexpectedly low.
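For context, here is a minimal sketch of how a per-box confidence score is commonly derived in this kind of pipeline. It assumes the inference script applies a sigmoid to each box's per-token logits and reports the maximum over tokens, then keeps boxes above a threshold. The function names, the example logits, and the `box_threshold` value are hypothetical and only illustrate the idea. Because each score is an independent sigmoid rather than a softmax over classes, a box can be placed and labeled correctly while its score stays well below 1.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def box_confidence(token_logits):
    """Score one box as the max per-token sigmoid probability (assumed scheme)."""
    return max(sigmoid(v) for v in token_logits)


def filter_boxes(all_token_logits, box_threshold=0.35):
    """Return (box_index, score) pairs whose score exceeds the threshold."""
    scored = [(i, box_confidence(logits)) for i, logits in enumerate(all_token_logits)]
    return [(i, s) for i, s in scored if s > box_threshold]


# Hypothetical logits: two candidate boxes, three text tokens each.
example_logits = [
    [-0.4, -2.0, -3.0],  # best token has a mildly negative logit
    [-4.0, -5.0, -6.0],  # all tokens strongly negative
]

kept = filter_boxes(example_logits, box_threshold=0.35)
for index, score in kept:
    print(f"box {index}: score {score:.3f}")
```

Under this scheme, the reported number depends only on the raw logit of the best-matching token, so comparing the raw logits for the training images against the threshold is one way to see where the low scores come from.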

I am wondering if you could kindly provide any guidance or suggestions on why there might be such a difference between the model's predictions and the confidence scores. Any insights would be greatly appreciated. Thank you so much for your time and support :))

[Attached image: annotated_image]

BIGBALLON commented 10 months ago