google-research-datasets / screen_annotation

The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format and describe the UI elements present on the screen: their type, location, OCR text, and a short description. It was introduced in the paper `ScreenAI: A Vision-Language Model for UI and Infographics Understanding`.

How to get F1 score @ IoU=0.1? #3

Open luyy12 opened 1 month ago

luyy12 commented 1 month ago

Hi, thank you for sharing this valuable dataset. How should the F1 score @ IoU=0.1 be calculated, given that the annotations include language descriptions of each UI element and those cannot be matched exactly?

gbaechler commented 3 weeks ago

For the object detection tasks (like Screen Annotation), only the UI class and the bounding box information are used; the text description is discarded.
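
As an illustration of that answer, here is a minimal Python sketch of an F1 score at an IoU threshold of 0.1 that compares only the UI class and the bounding box. It is not the evaluation code used for the paper. The greedy one-to-one matching, the `>=` comparison against the threshold, the `(x_min, y_min, x_max, y_max)` box layout, and the names `iou` and `f1_at_iou` are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)
Element = Tuple[str, Box]                # (ui_class, box); text description already dropped


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds: List[Element], golds: List[Element], threshold: float = 0.1) -> float:
    """F1 over elements; a match needs the same class and IoU >= threshold.

    Assumption: greedy one-to-one matching, highest IoU pairs first.
    """
    candidates = []
    for i, (pred_class, pred_box) in enumerate(preds):
        for j, (gold_class, gold_box) in enumerate(golds):
            if pred_class != gold_class:
                continue
            score = iou(pred_box, gold_box)
            if score >= threshold:
                candidates.append((score, i, j))

    # Assign each prediction and each ground-truth element at most once.
    candidates.sort(reverse=True)
    used_preds, used_golds = set(), set()
    for _, i, j in candidates:
        if i not in used_preds and j not in used_golds:
            used_preds.add(i)
            used_golds.add(j)

    true_positives = len(used_preds)
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(golds) if golds else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data; the class names are placeholders.
example_preds = [
    ("BUTTON", (0, 0, 10, 10)),
    ("TEXT", (20, 20, 30, 30)),
    ("IMAGE", (50, 50, 60, 60)),
]
example_golds = [
    ("BUTTON", (1, 1, 11, 11)),
    ("TEXT", (100, 100, 110, 110)),
]
example_f1 = f1_at_iou(example_preds, example_golds, threshold=0.1)
```

In this example only the `BUTTON` pair overlaps enough to count, so one of three predictions and one of two ground-truth elements are matched. To score a whole dataset, the true-positive, prediction, and ground-truth counts could be summed across screens before computing F1, or per-screen scores could be averaged; the thread does not say which is used.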