Closed (liukc19 closed this issue 4 months ago)
Hi there, you can check whether the target boxes are missed by the region proposer by commenting out the following lines; doing so draws every box detected by DDETR after NMS and thresholding.
I have tried re-training the model after changing the box format from [cx, cy, w, h] to [x1, y1, x2, y2] for the roi_align input, as reported in #11. But performance decreased on all benchmarks. I am still working out how this could happen...
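For readers following the format change mentioned above, here is a minimal, framework-free sketch of the [cx, cy, w, h] to [x1, y1, x2, y2] conversion. It is only an illustration, not the Groma code; the helper names `cxcywh_to_xyxy` and `xyxy_to_cxcywh` are hypothetical. On PyTorch tensors, `torchvision.ops.box_convert(boxes, in_fmt="cxcywh", out_fmt="xyxy")` performs the same conversion.

```python
def cxcywh_to_xyxy(box):
    """Convert a [cx, cy, w, h] box to [x1, y1, x2, y2] corner format."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def xyxy_to_cxcywh(box):
    """Convert a [x1, y1, x2, y2] box back to [cx, cy, w, h] format."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]


# Hypothetical example box: centre (50, 40), width 20, height 10.
box_cxcywh = [50.0, 40.0, 20.0, 10.0]
box_xyxy = cxcywh_to_xyxy(box_cxcywh)
print(box_xyxy)  # [40.0, 35.0, 60.0, 45.0]

# If the centre-format box were passed unconverted to code expecting corners,
# it would be read as x1=50, y1=40, x2=20, y2=10, i.e. x2 < x1 and y2 < y1.
```

If the two formats are mixed up, the pooled region no longer corresponds to the detected object, which is the kind of misalignment reported in #11.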
Hi, did you find out why the performance of the model decreased after aligning the bounding boxes? That is really weird. @machuofan
Thank you for your great work. I found that groma performs poorly on some object detection tasks, which makes it difficult for me to determine whether a problem occurs in the inference phase or the detection phase when I use groma for complex VQA tasks. Do you plan to fix the misaligned bounding box format bug that occurs when extracting region features?