yuwenmichael / Grounding-DINO-Batch-Inference

Support batch inference of Grounding DINO. "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Examples in the batch are not processed independently? #5

Open FiorenzoParascandolo1 opened 4 months ago

FiorenzoParascandolo1 commented 4 months ago

Hi everyone, I discovered this strange behavior.

I have 4 images (img_1, img_2, img_3, img_4).

If I run Grounding DINO on them with the following prompts: ["distance . mountains . valley . view .", "man . snow board . trick", "lunch . pizza . they .", "kitchen . refrigerator . "]

then I obtain the following per-class probabilities for each image in the batch:

{0: {'distance': 0.19821932911872864, 'mountains': 0.7135314345359802, 'valley': 0.42435237765312195, 'view': 0.38242971897125244}, 1: {'man': 0.3701115548610687, 'snow board': 0.31612950563430786, 'trick': 0.21027937531471252}, 2: {'lunch': 0.38231441378593445, 'pizza': 0.7270074486732483, 'they': 0.19436779618263245}, 3: {'kitchen': 0.6813028454780579, 'refrigerator': 0.5736187100410461}}

However, if I add the class "pineapple" to the third prompt:

["distance . mountains . valley . view .", "man . snow board . trick", "lunch . pizza . they . pineapple . ", "kitchen . refrigerator . "]

then the probabilities for the other elements in the batch, whose prompts are unchanged, also change:

{0: {'distance': 0.22729776799678802, 'mountains': 0.7141298651695251, 'valley': 0.43764322996139526, 'view': 0.367383748292923}, 1: {'man': 0.3758210241794586, 'snow board': 0.3222990036010742, 'trick': 0.21733753383159637}, 2: {'lunch': 0.3865318298339844, 'pineapple': 0.040617868304252625, 'pizza': 0.6494675278663635, 'they': 0.21683959662914276}, 3: {'kitchen': 0.6852126717567444, 'refrigerator': 0.5792219042778015}}
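To put a number on the shift, the two result dictionaries above can be compared class by class. This is a minimal sketch; the names `before`, `after`, and `max_abs_change` are illustrative and not part of the repository.

```python
# Probabilities copied verbatim from the two runs reported above.
before = {
    0: {'distance': 0.19821932911872864, 'mountains': 0.7135314345359802,
        'valley': 0.42435237765312195, 'view': 0.38242971897125244},
    1: {'man': 0.3701115548610687, 'snow board': 0.31612950563430786,
        'trick': 0.21027937531471252},
    2: {'lunch': 0.38231441378593445, 'pizza': 0.7270074486732483,
        'they': 0.19436779618263245},
    3: {'kitchen': 0.6813028454780579, 'refrigerator': 0.5736187100410461},
}
after = {
    0: {'distance': 0.22729776799678802, 'mountains': 0.7141298651695251,
        'valley': 0.43764322996139526, 'view': 0.367383748292923},
    1: {'man': 0.3758210241794586, 'snow board': 0.3222990036010742,
        'trick': 0.21733753383159637},
    2: {'lunch': 0.3865318298339844, 'pineapple': 0.040617868304252625,
        'pizza': 0.6494675278663635, 'they': 0.21683959662914276},
    3: {'kitchen': 0.6852126717567444, 'refrigerator': 0.5792219042778015},
}


def max_abs_change(before, after):
    """For each sample, return the largest absolute change over shared classes."""
    changes = {}
    for idx in before:
        # Only compare classes present in both runs ('pineapple' is new).
        shared = before[idx].keys() & after[idx].keys()
        changes[idx] = max(abs(after[idx][c] - before[idx][c]) for c in shared)
    return changes


for idx, delta in max_abs_change(before, after).items():
    print(f"sample {idx}: max |delta p| = {delta:.4f}")
```

With these numbers, the samples whose prompts were not touched (0, 1, and 3) shift by up to roughly 0.03, and 'pizza' in the modified sample drops by about 0.08.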

It seems the samples in the batch are not processed independently. Has anyone encountered the same problem, or does anyone have suggestions for fixing it? Thanks in advance.
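One way to narrow this down might be to run each image/prompt pair on its own and compare the result with the batched output. The sketch below assumes a hypothetical `predict_fn(images, prompts)` that returns one `{class: probability}` dict per sample; that signature is an illustration, not this repository's API. The toy `leaky_predict` is not Grounding DINO: it only mimics a model whose scores depend on the longest prompt in the batch (as could happen through padding), which is a guess about the cause rather than a confirmed diagnosis.

```python
def batch_vs_single(predict_fn, images, prompts):
    """Per sample, return the largest gap between batched and one-at-a-time output."""
    batched = predict_fn(images, prompts)
    gaps = []
    for i, (img, prompt) in enumerate(zip(images, prompts)):
        # Re-run the same sample as a batch of size one.
        single = predict_fn([img], [prompt])[0]
        gaps.append(max(abs(batched[i][c] - single[c]) for c in single))
    return gaps


def leaky_predict(images, prompts):
    """Toy stand-in (hypothetical): scores depend on the longest prompt in the batch."""
    longest = max(len(p) for p in prompts)
    results = []
    for p in prompts:
        classes = [c.strip() for c in p.split(".") if c.strip()]
        results.append({c: len(c) / (len(c) + longest) for c in classes})
    return results


images = ["img_1", "img_2", "img_3", "img_4"]  # placeholders for real images
prompts = [
    "distance . mountains . valley . view .",
    "man . snow board . trick",
    "lunch . pizza . they . pineapple . ",
    "kitchen . refrigerator . ",
]

gaps = batch_vs_single(leaky_predict, images, prompts)
for i, gap in enumerate(gaps):
    print(f"sample {i}: batched vs. single gap = {gap:.4f}")
```

If the same comparison on the real model shows gaps well above floating-point noise, that would suggest the difference comes from batching itself rather than from the prompt change alone.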