IDEA-Research / DAB-DETR

[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"
Apache License 2.0

cat(y, x) in gen_sineembed_for_position() #36

Closed: yyccli closed this issue 2 years ago

yyccli commented 2 years ago

Hi, first of all, thanks for your great work on improving DETR-like detectors! I have a question about the cat() operation used to build the query positional embedding in gen_sineembed_for_position() (models/DAB_DETR/transformer.py, lines 51 and 61). Why does the code use

pos = torch.cat((pos_y, pos_x)...)

Shouldn't x come before y?

SlongLiu commented 2 years ago

Our design aims to be consistent with the positional encoding of image features. As shown in https://github.com/IDEA-opensource/DAB-DETR/blob/main/models/DAB_DETR/position_encoding.py#L58, the order there is pos_y followed by pos_x.
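
To illustrate why a consistent order matters, here is a minimal pure-Python sketch. It is not the repository's code: the helper names (sine_embed_1d, embed_yx, embed_xy, dot) and the tiny num_feats=4 are assumptions made for illustration, and the real function operates on batched tensors. The sketch only shows that when the query embedding and the image-feature embedding use the same (y, x) channel layout, the y channels of one line up with the y channels of the other.

```python
import math

def sine_embed_1d(value, num_feats=4, temperature=10000.0):
    """Sinusoidal embedding of one normalized coordinate in [0, 1].

    Hypothetical helper: even channels use sin, odd channels use cos,
    and each sin/cos pair shares one frequency.
    """
    scaled = value * 2 * math.pi
    emb = []
    for i in range(num_feats):
        freq = temperature ** (2 * (i // 2) / num_feats)
        angle = scaled / freq
        emb.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return emb

def embed_yx(x, y):
    # y channels first, then x: the order described in the reply above
    return sine_embed_1d(y) + sine_embed_1d(x)

def embed_xy(x, y):
    # swapped order, shown only for comparison
    return sine_embed_1d(x) + sine_embed_1d(y)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y = 0.2, 0.7
feature = embed_yx(x, y)                    # stand-in for the image-feature encoding
matched = dot(embed_yx(x, y), feature)      # same layout on both sides
mismatched = dot(embed_xy(x, y), feature)   # x channels meet y channels

# matched equals num_feats (4) up to rounding, because sin^2 + cos^2 = 1
# for each frequency; mismatched is smaller for this point.
print(matched, mismatched)
```

Either order would work as long as both sides agree. In this sketch, the mismatch is what lowers the similarity between a query and the feature at the same location.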

yyccli commented 2 years ago

Hi, thank you for your reply. I understand now.