shikras / d-cube

A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
https://arxiv.org/abs/2307.12813

about evaluate #2

Closed JunL-Geek closed 1 year ago

JunL-Geek commented 1 year ago

Regarding the output format: a sentence may predict one box or more, is that right? And for a given sentence, should the output contain all the boxes that should be predicted?

Charles-Xie commented 1 year ago

Hi, sorry for the late response. I see that the issue is closed, but I will try to answer it here. In our task, the model can predict zero, one, or multiple boxes for one sentence (i.e., one category) in one image. The output format is a JSON file that contains a list, and each item in this list is one predicted target. Please see this documentation (https://github.com/shikras/d-cube/blob/main/doc.md#output-format) for more details. Thanks for your interest in our work; we hope this answer clarifies your question. Feel free to reopen the issue if anything is still unclear.
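To make the "flat list, one item per predicted target" idea concrete, here is a minimal sketch of how such a list could be assembled. The field names (`image_id`, `category_id`, `bbox`, `score`) and the `[x, y, width, height]` box layout are assumptions modeled on COCO-style detection results, and `build_predictions` is a hypothetical helper, not part of the d-cube toolkit; please check the linked documentation for the exact schema.

```python
import json


def build_predictions(per_pair_boxes):
    """Flatten {(image_id, category_id): [(bbox, score), ...]} into one list.

    Hypothetical helper. Each (image, sentence) pair may contribute zero,
    one, or several items; a pair with no boxes simply adds nothing.
    """
    results = []
    for (image_id, category_id), boxes in per_pair_boxes.items():
        for bbox, score in boxes:
            results.append({
                "image_id": image_id,        # assumed field name
                "category_id": category_id,  # one sentence == one category
                "bbox": [float(v) for v in bbox],  # assumed [x, y, w, h]
                "score": float(score),
            })
    return results


# Made-up values, for illustration only.
example = {
    (1, 10): [],                                      # zero boxes
    (1, 11): [((12.0, 30.0, 50.0, 80.0), 0.91)],      # one box
    (2, 10): [((5.0, 5.0, 20.0, 40.0), 0.75),
              ((60.0, 8.0, 22.0, 41.0), 0.62)],       # multiple boxes
}

predictions = build_predictions(example)
json_text = json.dumps(predictions, indent=2)  # write this string to a .json file
print(json_text)
```

Note that the sentence with no matching object in image 1 leaves no trace in the output: there is no per-sentence container, only individual predicted targets.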