fundamentalvision / Deformable-DETR

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Apache License 2.0

Query about Bounding Box Head as Relative Offsets #89

Open gopi-erabati opened 3 years ago

gopi-erabati commented 3 years ago

Appendix A.3 of the paper states:

Since the multi-scale deformable attention module extracts image features around the reference point, we design the detection head to predict the bounding box as relative offsets w.r.t. the reference point to further reduce the optimization difficulty.

Could you please explain, or point me to the code that shows, where and how the head is designed to predict the bounding box as relative offsets w.r.t. the reference point? If we want to predict absolute bounding boxes instead (I understand that relative offsets were chosen to reduce the optimization difficulty), is that possible without altering the MSDeformableAttention module?
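
To make the question concrete, here is a minimal sketch of what "relative offsets w.r.t. the reference point" could mean, based on my reading of A.3: the head's raw x/y outputs are added to the inverse sigmoid of the reference point before the final sigmoid, while width and height are decoded directly. The names `decode_box`, `head_out`, `reference_xy`, and the `relative` flag are hypothetical and written in plain Python for illustration; this is not taken from the repository code.

```python
import math


def inverse_sigmoid(p, eps=1e-5):
    """Logit of p, clamped away from 0 and 1 for numerical stability."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(head_out, reference_xy, relative=True):
    """Turn raw head outputs (bx, by, bw, bh) into a normalized box.

    relative=True:  the center is predicted as an offset from the reference
                    point in logit space (my reading of A.3).
    relative=False: the center is predicted directly (absolute), ignoring
                    the reference point. This switch is an assumption made
                    for illustration only.
    """
    bx, by, bw, bh = head_out
    if relative:
        bx += inverse_sigmoid(reference_xy[0])
        by += inverse_sigmoid(reference_xy[1])
    # All four values are squashed to [0, 1] as normalized coordinates.
    return tuple(sigmoid(v) for v in (bx, by, bw, bh))


# With zero head output, the relative box is centered on the reference point.
rel_box = decode_box((0.0, 0.0, 0.0, 0.0), (0.3, 0.7), relative=True)
abs_box = decode_box((0.0, 0.0, 0.0, 0.0), (0.3, 0.7), relative=False)
```

In this sketch, switching to absolute prediction only changes the decoding step in the head and leaves the attention computation alone. Whether the same holds for the actual implementation is exactly what I would like to confirm.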