yechenzhi closed 3 years ago
Hi, following previous work such as UNITER, we use the object proposals from Mask R-CNN, as provided by "MAttNet: Modular Attention Network for Referring Expression Comprehension."
Thank you, your work is amazing. It inspires me a lot, and I really appreciate it.
In your grounding task, you used 'dets.json' to evaluate your results. How did you get the 'dets.json' file? Which object detector did you use?