cc13qq closed this issue 6 months ago
The evaluation for image_annotation is identical to that for text_choice, since both use the same choice candidates. The evaluation for element_attributes is also described in the paper. You are free to improve the search algorithm used for element_attributes in the paper, or to combine it with other grounding strategies. (See my replies in #14 for more discussion.)
For future upgrades to the SeeAct codebase, see my reply in #14.
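To illustrate the point that two grounding modes sharing the same choice candidates can be scored the same way, here is a minimal sketch. It is not the SeeAct evaluation code: the function name `choice_accuracy`, the letter-style labels, and the sample predictions are all hypothetical, and it assumes each prediction has already been reduced to one candidate label (or `None` when no choice could be parsed).

```python
from typing import Optional, Sequence


def choice_accuracy(predicted: Sequence[Optional[str]], gold: Sequence[str]) -> float:
    """Return the fraction of examples whose predicted label matches the gold label.

    Hypothetical helper: labels are compared case-insensitively after stripping
    whitespace, and a missing prediction (None) counts as wrong.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(
        p is not None and p.strip().upper() == g.strip().upper()
        for p, g in zip(predicted, gold)
    )
    return correct / len(gold)


# Made-up labels for illustration only; both modes are scored against the
# same gold choices because they share the same candidate list.
gold_labels = ["A", "B", "D"]
text_choice_preds = ["A", "C", None]
image_annotation_preds = ["a", "B", "D"]

text_choice_score = choice_accuracy(text_choice_preds, gold_labels)
image_annotation_score = choice_accuracy(image_annotation_preds, gold_labels)
```

Because the scorer only sees candidate labels, the same function applies unchanged to either mode. For element_attributes, the predicted attributes would first have to be mapped back to a candidate, which is the search step described in the paper.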
Thanks for your interest in our work!
Thank you for your newly updated files! I have successfully generated screenshots.
However, I'm still curious about how to evaluate element attributes and image annotations, since SeeAct's outputs in these modes are not predicted choices. I couldn't find the ground truth for these two splits.
Are you also planning to release the evaluation code for element attributes and image annotations?
Looking forward to your reply.