Closed: LiewFeng closed this issue 2 years ago
In our experiments, our implementation (predicting the bbox offset twice) works better than predicting it only once.
Thank you for your quick reply! Could you kindly share the detailed performance numbers? I'm interested because my intuition is that a normalized input would work better.
Both updates work in unnormalized space. The main difference is the detach() operation, which may affect the performance.
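To illustrate what "working in unnormalized space" means here, below is a minimal pure-Python sketch, not the repository's code. It assumes a box is stored as normalized (cx, cy, w, h) values in (0, 1), maps it to logit space with an inverse sigmoid, adds the predicted offset there, and maps the result back with a sigmoid. The function names `inverse_sigmoid`, `sigmoid`, and `refine_box` and all numeric values are hypothetical and chosen only for illustration. The `detach()` call is a PyTorch autograd operation that stops gradients from flowing through the refined reference points; it is not modeled in this sketch.

```python
import math


def inverse_sigmoid(x, eps=1e-5):
    """Map a normalized value in [0, 1] to unnormalized (logit) space."""
    x = min(max(x, 0.0), 1.0)   # clamp to the valid range
    x1 = max(x, eps)            # avoid log(0)
    x2 = max(1.0 - x, eps)      # avoid division by zero
    return math.log(x1 / x2)


def sigmoid(z):
    """Map an unnormalized value back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def refine_box(reference, offset):
    """Add offsets to a normalized box in logit space, then re-normalize."""
    return [sigmoid(inverse_sigmoid(r) + d) for r, d in zip(reference, offset)]


# Hypothetical normalized reference box (cx, cy, w, h) and predicted offsets.
reference = [0.5, 0.5, 0.2, 0.2]
offset = [0.1, -0.1, 0.0, 0.3]
refined = refine_box(reference, offset)
```

Because the offset is added before the sigmoid, the refined coordinates always stay inside (0, 1), and a zero offset leaves the reference value unchanged up to floating-point error.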
Got it. Thanks!
The model predicts bbox offsets twice: once in [DAB-DETR/blob/main/models/DAB_DETR/transformer.py, Lines 255-265](https://github.com/IDEA-opensource/DAB-DETR/blob/main/models/DAB_DETR/transformer.py#:~:text=if%20self.bbox_embed,reference_points%20%3D%20new_reference_points.detach()), and once in DAB-DETR/blob/main/models/DAB_DETR/DABDETR.py, Lines 171-184. The difference is that the input to the second prediction is normalized. The second prediction seems unnecessary. Am I right?