Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
BSD 3-Clause "New" or "Revised" License
25.4k stars, 2.91k forks

MULTIPLE OBJECT in the images are not getting detected #451

Open sumanttyagi opened 11 months ago

sumanttyagi commented 11 months ago

While running the pretrained model for detection, I found that I cannot get multiple detections. For example, when I ask it to detect a car, it detects only one car, not all the cars in the image.

junchen14 commented 11 months ago

If you use the [detection] identifier, the model is expected to return the bounding boxes for all objects matching the car label.

Which pretrained checkpoint are you loading? Is it "MiniGPT-v2 (online developing demo)"? That checkpoint should give output closer to what you expect.
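To check how many boxes a [detection] prompt actually returned, a small parser can help. This is only a sketch: it assumes the model emits boxes as `{<x1><y1><x2><y2>}` groups after a `<p>label</p>` tag, with multiple boxes separated by `<delim>`. That format is not confirmed in this thread, and `build_detection_prompt` and `parse_boxes` are hypothetical helper names, not part of the MiniGPT-4 repository.

```python
import re
from typing import List, Tuple

# ASSUMPTION: each box is printed as {<x1><y1><x2><y2>} with integer coordinates.
# Adjust the pattern if your checkpoint prints boxes differently.
BOX_RE = re.compile(r"\{<(\d+)><(\d+)><(\d+)><(\d+)>\}")


def build_detection_prompt(label: str) -> str:
    """Prefix the label with the [detection] task identifier."""
    return f"[detection] {label}"


def parse_boxes(output: str) -> List[Tuple[int, ...]]:
    """Return every (x1, y1, x2, y2) group found in the model's text output."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_RE.finditer(output)]


# Hand-written sample string in the assumed format, not real model output.
sample = "<p>car</p> {<10><20><30><40>}<delim>{<50><20><70><40>}"
boxes = parse_boxes(sample)
print(build_detection_prompt("car"), "->", len(boxes), "boxes")
```

Comparing `len(boxes)` with the count given by a VQA-style question is a quick way to confirm that detection is returning fewer objects than the model reports elsewhere.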

sumanttyagi commented 11 months ago

Yes, I am using MiniGPT-v2 (online developing demo).

While doing this, I observed that in VQA it can answer that there are 3 cars in the image, but in detection it shows only 1. I wonder how it can count 3 cars in VQA yet fail to detect all of them.

junchen14 commented 11 months ago

Sometimes the model is sensitive to the singular and plural forms of the input, such as 'car' and 'cars'. You could also try different prompts.
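One way to explore this systematically is to generate a few prompt variants and try each against the same image. The sketch below is illustrative only: `detection_prompt_variants` is a hypothetical helper, the "all ..." phrasing is just one guess at a useful variant, and the default pluralization naively appends "s", so pass the plural explicitly for irregular nouns.

```python
from typing import List, Optional


def detection_prompt_variants(singular: str, plural: Optional[str] = None) -> List[str]:
    """Build [detection] prompts covering singular and plural wording."""
    if plural is None:
        # Naive default; wrong for words like "bus" or "person".
        plural = singular + "s"
    return [
        f"[detection] {singular}",
        f"[detection] {plural}",
        f"[detection] all {plural}",
    ]


for prompt in detection_prompt_variants("car"):
    # Send each prompt to the model with the same image and compare box counts.
    print(prompt)
```

Running each variant and counting the returned boxes shows whether the plural form recovers the missing detections.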