Duplicate entries and size

lechnermichael commented 11 months ago

Thanks for sharing the dataset. I have a question regarding the annotations themselves. Looking at any entry for an image in the Rico dataset, there are multiple similar boxes, and only the first entry appears to be the correct one; all the others differ only in their xmin coordinate.

'screen_id': 31314, 'screen_elements': [ {'xmin': 0.9351851940155029, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'},

{'xmin': 0.48240742087364197, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'},

{'xmin': 0.27222222089767456, 'ymin': 0.11874999850988388, 'xmax': 0.949999988079071, 'ymax': 0.14687499403953552, 'label': 'ICON_THREE_DOTS:MORE'} ]

What are the other annotations for? After removing the other boxes, I am left with only one annotation for each image in any of the three categories (iconnet, semantic, grouping), which leaves me with ~44k annotations per category. For the shape annotations, you state in the paper that there are 350k instances. Does this number include the incorrect bounding boxes?
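For anyone wanting to reproduce the filtering described above, here is a minimal sketch of how the repeated boxes could be dropped. It assumes each screen is a dict with a `screen_elements` list shaped like the excerpt, that repeats share `ymin`, `xmax`, `ymax` and `label`, and that the first entry of each group is the correct one. The function name `drop_repeated_boxes` is hypothetical and not part of the dataset tooling. Note that this heuristic would also discard genuinely distinct boxes that happen to share those four values.

```python
def drop_repeated_boxes(elements):
    """Keep only the first box of each group that shares ymin, xmax, ymax and label.

    Assumption: boxes that differ only in xmin are the repeated entries
    discussed in this issue, and the first one listed is the valid one.
    """
    seen = set()
    kept = []
    for el in elements:
        key = (el["ymin"], el["xmax"], el["ymax"], el["label"])
        if key in seen:
            continue  # same box apart from xmin: treat as a repeat
        seen.add(key)
        kept.append(el)
    return kept


# The entry quoted in this issue, reproduced as a Python dict.
screen = {
    "screen_id": 31314,
    "screen_elements": [
        {"xmin": 0.9351851940155029, "ymin": 0.11874999850988388,
         "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
         "label": "ICON_THREE_DOTS:MORE"},
        {"xmin": 0.48240742087364197, "ymin": 0.11874999850988388,
         "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
         "label": "ICON_THREE_DOTS:MORE"},
        {"xmin": 0.27222222089767456, "ymin": 0.11874999850988388,
         "xmax": 0.949999988079071, "ymax": 0.14687499403953552,
         "label": "ICON_THREE_DOTS:MORE"},
    ],
}

cleaned = drop_repeated_boxes(screen["screen_elements"])
print(len(cleaned), cleaned[0]["xmin"])
```

On the entry above, only the first of the three boxes is kept.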

YiDa858 commented 9 months ago

I have the same question. There are many bboxes that do not look visually effective. What should I do with them? @ssunkara1

cjfcsjt commented 6 months ago

I have the same question. What should I do with them? @ssunkara1

changquanyou commented 4 months ago

I have the same question. What should I do with them? @cjfcsjt @YiDa858 @lechnermichael @ssunkara1

ssunkara1 commented 2 months ago

Thanks a lot for raising this issue. I just fixed the data so that the repeated coordinates issue is no longer present. Please have a look and let me know if this looks better.
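One possible way to check the updated data is sketched below. It counts, per screen, how many groups of boxes still share `ymin`, `xmax`, `ymax` and `label`. The helper name `count_repeated_groups` and the two sample lists are hypothetical, made up for illustration and not taken from the dataset; the field names follow the excerpt in the first comment.

```python
from collections import Counter


def count_repeated_groups(elements):
    """Return how many (ymin, xmax, ymax, label) groups occur more than once."""
    counts = Counter(
        (el["ymin"], el["xmax"], el["ymax"], el["label"]) for el in elements
    )
    return sum(1 for n in counts.values() if n > 1)


# Hypothetical sample data, for illustration only.
clean_elements = [
    {"xmin": 0.1, "ymin": 0.2, "xmax": 0.3, "ymax": 0.4, "label": "A"},
    {"xmin": 0.5, "ymin": 0.6, "xmax": 0.7, "ymax": 0.8, "label": "B"},
]
repeated_elements = clean_elements + [
    {"xmin": 0.0, "ymin": 0.2, "xmax": 0.3, "ymax": 0.4, "label": "A"},
]

clean_count = count_repeated_groups(clean_elements)
repeated_count = count_repeated_groups(repeated_elements)
print(clean_count, repeated_count)
```

A result of zero for every screen would suggest the repeated-coordinate pattern is gone, although it would not prove that every remaining box is correct.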

Many many apologies for the delay in reply here. :(

google-research-datasets / rico_semantics
