Open sumanttyagi opened 11 months ago
If you use the [detection] identifier, the model is expected to extract all of the bounding boxes for the car label.
Which pretrained checkpoint are you loading? Is it "MiniGPT-v2 (online developing demo)"? That checkpoint should give output closer to what you expect.
Yes, I am using MiniGPT-v2 (online developing demo).
While doing this, I observed that in VQA the model can answer that there are 3 cars in the image, but in detection it only shows 1. I wonder how it can predict 3 in VQA but cannot detect all of the cars.
Sometimes the model is sensitive to the singular and plural forms of inputs, such as 'car' and 'cars'. Maybe you can also explore different prompts.
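To make the singular/plural comparison easier, here is a minimal sketch that builds both prompt forms and counts the boxes in a response string. It assumes the detection output looks like `<p>car</p> {<x1><y1><x2><y2>}<delim>{<x1><y1><x2><y2>}`; adjust the regex if your output differs. `prompt_variants` and `parse_boxes` are hypothetical helper names, not part of the MiniGPT-4 repo, and the pluralization is deliberately naive (it just adds or strips a trailing "s").

```python
import re

# Assumption: each box is emitted as {<x1><y1><x2><y2>} with integer coordinates.
BOX_RE = re.compile(r"\{<(\d+)><(\d+)><(\d+)><(\d+)>\}")


def prompt_variants(label):
    """Return [detection] prompts for the singular and a naive plural form."""
    singular = label[:-1] if label.endswith("s") else label
    plural = singular + "s"
    return [f"[detection] {singular}", f"[detection] {plural}"]


def parse_boxes(response):
    """Extract every (x1, y1, x2, y2) tuple found in a model response string."""
    return [tuple(int(v) for v in match) for match in BOX_RE.findall(response)]


if __name__ == "__main__":
    # Hypothetical response string, only used to show the parsing step.
    sample = "<p>car</p> {<10><20><30><40>}<delim>{<50><60><70><80>}"
    for prompt in prompt_variants("car"):
        print(prompt)
    print(len(parse_boxes(sample)), "boxes parsed")
```

Running each prompt variant through the model and comparing `len(parse_boxes(...))` should show whether one form returns more boxes than the other.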
While running the pretrained model for detection, I was not able to get multiple detections. For example, if I want to detect cars, it only detects one car, not all of them.