microsoft / GLIP

Grounded Language-Image Pre-training

COCO metric Evaluation with different prompts & getting inferences images? #94

Open ml-ai-cv opened 1 year ago

ml-ai-cv commented 1 year ago

Hello,

Thanks for the great work.

I have two questions:

  1. I am using test_grounding_net.py to evaluate model performance on a custom dataset, but I don't see any inference images with bounding boxes in the output folder. I only see coco_results.pth and predictions.pth.
  2. I am trying to improve performance on the custom dataset. I have the lines below in my task config file. Since I cannot see the bounding boxes on an image after evaluation, I cannot tell how the prompt is working.

    CAPTION_PROMPT: '[{"prefix": " ", "name": "a blue circle . a white circle . a gray circle", "suffix": ""}, ]'
    OVERRIDE_CATEGORY: '[{"id": 1, "name": " a white circle", "supercategory": "shapes"}]'

    • When I give "a white circle" as the class in my ground-truth JSON file and the prompt "a blue circle . a white circle . a gray circle", does the model detect "a blue circle", "a white circle", and "a gray circle" as individual classes in the image, or does it treat them all as the "a white circle" class from OVERRIDE_CATEGORY and the ground-truth JSON file?
    • For example, my image quality is poor, and I want to detect all the circles by listing white, gray, and blue in the prompt while keeping "a white circle" as the only class in my ground-truth JSON file. (I am trying to improve detection of white circles by explicitly naming the possible circle colors in the prompt.)
    • Also, I tried putting the prompt in "suffix" and "prefix", but an entry needs a "name" to work properly. Giving the prompt directly without prefix, name, and suffix does not work either (e.g. CAPTION_PROMPT: '[{"a blue circle . a white circle . a gray circle"}, ]' does not work). Please let me know how prefix, suffix, and name work. I couldn't find any documentation or tutorial about them. Can you point me to something so that I can understand these terms and use them properly?
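
To make the prefix/name/suffix question concrete, here is a small, hypothetical model of how such entries could be turned into a caption. It is not taken from the GLIP source: `parse_prompt`, `build_caption`, and the `". "` separator are illustrative names and assumptions, and it assumes one prompt entry per category. `ast.literal_eval` is used only to show what the two strings in this post evaluate to as Python literals.

```python
import ast

# Assumed separator placed between entries; the real value may differ.
SEPARATOR = ". "


def parse_prompt(raw):
    """Evaluate a CAPTION_PROMPT-style string and check each entry has a "name"."""
    entries = ast.literal_eval(raw)  # accepts the trailing ", ]" as a Python literal
    for entry in entries:
        if not isinstance(entry, dict) or "name" not in entry:
            raise ValueError(f"entry {entry!r} is not a dict with a 'name' key")
    return entries


def build_caption(entries):
    """Join prefix + name + suffix per entry; record the character span of each name.

    In this sketch each span stands for one category, so one entry means one category.
    """
    caption = ""
    spans = []
    for i, entry in enumerate(entries):
        caption += entry.get("prefix", "")
        start = len(caption)
        caption += entry["name"]
        spans.append((start, len(caption)))
        caption += entry.get("suffix", "")
        if i != len(entries) - 1:
            caption += SEPARATOR
    return caption, spans


# The prompt from this post: a single entry whose name lists three phrases.
single = parse_prompt(
    '[{"prefix": " ", "name": "a blue circle . a white circle . a gray circle", "suffix": ""}, ]'
)
caption, spans = build_caption(single)
print(repr(caption), spans)  # one span covering all three phrases

# The bare-string form evaluates to a list containing a set, not a dict.
try:
    parse_prompt('[{"a blue circle . a white circle . a gray circle"}, ]')
except ValueError as err:
    print("rejected:", err)
```

Under these assumptions, a single entry whose name lists three colors yields a single span, so all three phrases would be tied to one category, while the bare-string form evaluates to a set and has no "name" key at all. Whether GLIP itself behaves this way would need to be confirmed against its code.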

Thanks for your help!
