google-research-datasets / rico_semantics

Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations between selected general UI elements and their text labels. Annotations also include human-annotated bounding boxes, which are more accurate and have greater coverage of UI elements.
Creative Commons Attribution Share Alike 4.0 International

Duplicate entries and size #1

Open lechnermichael opened 11 months ago

lechnermichael commented 11 months ago

Thanks for sharing the dataset. I have a question regarding the annotations themselves. Looking at any entry for an image in the RICO dataset, there are multiple similar boxes, and only the first entry appears to be correct; all the others differ only in their xmin coordinate:

'screen_id': 31314, 'screen_elements': [
{'xmin': 0.9351851940155029, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'},
{'xmin': 0.48240742087364197, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'},
{'xmin': 0.27222222089767456, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'} ]

What are the other annotations for? After removing the other boxes, I am left with only one annotation per image in each of the three categories (iconnet, semantic, grouping), which leaves me at ~44k annotations per category. For the shape annotations, the paper states 350k instances. Does this number include the incorrect bounding boxes?
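For anyone wanting to reproduce the filtering described above, here is a minimal Python sketch. It assumes each entry is a dict with the keys shown in the excerpt (xmin, ymin, xmax, ymax, label) and that the first box in a run of repeats is the correct one, which is only my observation and not something confirmed by the maintainers. The function name drop_repeated_boxes is made up for illustration.

```python
from typing import Any, Dict, List

Box = Dict[str, Any]


def drop_repeated_boxes(elements: List[Box]) -> List[Box]:
    """Keep the first box for each (ymin, xmax, ymax, label) combination.

    Assumption: repeats differ only in xmin and the first one is correct.
    """
    seen = set()
    kept = []
    for element in elements:
        # xmin is deliberately left out of the key, since it is the only
        # field that changes between the repeated entries.
        key = (element["ymin"], element["xmax"], element["ymax"], element["label"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(element)
    return kept


# The three boxes from the excerpt above (screen_id 31314).
example = [
    {"xmin": 0.9351851940155029, "ymin": 0.11874999850988388,
     "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
     "label": "ICON_THREE_DOTS:MORE"},
    {"xmin": 0.48240742087364197, "ymin": 0.11874999850988388,
     "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
     "label": "ICON_THREE_DOTS:MORE"},
    {"xmin": 0.27222222089767456, "ymin": 0.11874999850988388,
     "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
     "label": "ICON_THREE_DOTS:MORE"},
]

filtered = drop_repeated_boxes(example)
print(len(filtered), filtered[0]["xmin"])
```

Note that this would also merge two genuinely distinct boxes that happen to share ymin, xmax, ymax and label, so it is a rough workaround rather than a proper fix.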

YiDa858 commented 9 months ago

I have the same question. Many of the bboxes do not seem to match anything visible on the screen. What should I do with them? @ssunkara1

cjfcsjt commented 6 months ago

I have the same question. What should I do with them? @ssunkara1

changquanyou commented 4 months ago

I have the same question. What should I do with them? @cjfcsjt @YiDa858 @lechnermichael @ssunkara1

ssunkara1 commented 2 months ago

Thanks a lot for raising this issue. I just fixed the data so that the repeated coordinates issue is no longer present. Please have a look and let me know if this looks better.

Many many apologies for the delay in reply here. :(