JunL-Geek closed this issue 1 year ago
Hi, sorry for the late response. I see that the issue is closed, but I will try to answer it here. In our task, the model can predict zero, one, or multiple boxes for one sentence (i.e., one category) in one image. The output format is a JSON file that contains a single list, and each item in this list is one predicted target, so a sentence with several predicted boxes contributes several items and a sentence with no predicted box contributes none. Please see this documentation (https://github.com/shikras/d-cube/blob/main/doc.md#output-format) for more details. Thanks for your interest in our work; we hope this answer clarifies your question. Feel free to reopen the issue if anything is still unclear.
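To make the "one list, one item per predicted box" idea concrete, here is a minimal sketch. The field names (`image_id`, `category_id`, `bbox`, `score`) and the `[x, y, width, height]` box layout are assumptions borrowed from COCO-style detection results, and `to_prediction_list` is a hypothetical helper, not part of the d-cube toolkit. Please check the linked documentation for the exact keys before using it.

```python
import json


def to_prediction_list(per_pair_boxes):
    """Flatten per-(image, sentence) predictions into one flat list.

    per_pair_boxes maps (image_id, sentence_id) to a list of (box, score)
    pairs. The output keys below are ASSUMED (COCO-style), not confirmed
    by this thread.
    """
    predictions = []
    for (image_id, sentence_id), boxes in per_pair_boxes.items():
        # An empty list means "no box predicted": nothing is appended.
        for box, score in boxes:
            predictions.append({
                "image_id": image_id,
                "category_id": sentence_id,  # one sentence = one category
                "bbox": list(box),           # assumed [x, y, width, height]
                "score": score,
            })
    return predictions


# Made-up example: zero, one, and two boxes for three image/sentence pairs.
example = {
    (1, 10): [],
    (1, 11): [((5.0, 5.0, 20.0, 30.0), 0.9)],
    (2, 10): [((0.0, 0.0, 10.0, 10.0), 0.8),
              ((40.0, 40.0, 15.0, 15.0), 0.6)],
}

predictions = to_prediction_list(example)
json_text = json.dumps(predictions)  # this string is what would be saved
```

In this sketch the pair with no predicted box simply does not appear in the output, and the pair with two boxes appears as two separate items sharing the same `image_id` and `category_id`.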
For the output format, a sentence may predict one box or more, is that right? For a given sentence, should the output contain all the boxes that should be predicted?