microsoft / SoM

Set-of-Mark Prompting for GPT-4V and LMMs
MIT License
1.11k stars, 87 forks

Increase the number of objects/elements detected #28

Closed by aakashb09 5 months ago

aakashb09 commented 5 months ago

In my testing, I found that some areas are not marked for mobile/website screenshots. Is there a way to get it to mark more things?

Would finetuning Detectron2 on a custom dataset suit my use case?

[Image: IMG_4708]

abrichr commented 5 months ago

Make sure you are setting the granularity to the maximum value.

In https://github.com/OpenAdaptAI/OpenAdapt/pull/610 we had good results by first segmenting the screenshot, processing each segment individually, and then recombining the results.
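
To make the segment, process, recombine idea concrete, here is a rough sketch. It is not the code from that PR. It assumes "segmenting" means cutting the screenshot into a fixed grid of tiles, and `detect` is a hypothetical callback standing in for whatever you run on each crop (SoM, Detectron2, etc.). `tile_boxes` and `detect_tiled` are made-up names for illustration.

```python
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int) -> List[Box]:
    """Cover a width x height image with non-overlapping tiles of at most tile x tile pixels."""
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Tiles on the right and bottom edges are clipped to the image bounds.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def detect_tiled(
    width: int,
    height: int,
    tile: int,
    detect: Callable[[Box], List[Box]],
) -> List[Box]:
    """Run `detect` on every tile and shift its results into full-image coordinates.

    `detect` is a hypothetical callback (an assumption, not a SoM API): it receives
    the tile's region in the full image, crops and processes that region however it
    likes, and returns boxes relative to the tile's top-left corner.
    """
    merged = []
    for region in tile_boxes(width, height, tile):
        left, top, _, _ = region
        for x0, y0, x1, y1 in detect(region):
            merged.append((x0 + left, y0 + top, x1 + left, y1 + top))
    return merged
```

Each crop is smaller than the full screenshot, so small UI elements take up a larger share of what the model sees. One caveat with a plain grid: an element that straddles a tile border can be split or missed, so you may want overlapping tiles plus a de-duplication step when recombining.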