fundamentalvision / Deformable-DETR

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Apache License 2.0

The Meaning of Variables (src_valid_ratios) #197

Open haiyang426 opened 1 year ago

haiyang426 commented 1 year ago

Why does the decoder multiply the reference points by `src_valid_ratios` in every layer at this location? Does this have any effect on the result, or would a single multiplication be enough?

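        if reference_points.shape[-1] == 4:
            # Box reference points: repeat the ratios so all four coordinates are scaled.
            reference_points_input = reference_points[:, :, None] \
                                     * torch.cat([src_valid_ratios, src_valid_ratios], -1)[:, None]
        else:
            assert reference_points.shape[-1] == 2
            # Point reference points: broadcast over a new per-level axis.
            # The product goes into a new tensor; reference_points itself is unchanged.
            reference_points_input = reference_points[:, :, None] * src_valid_ratios[:, None]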
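
One way to probe the question is to run the snippet on toy tensors. The sketch below is illustrative only: `scale_reference_points` is a hypothetical helper name that wraps the lines above, and the sizes (batch 2, 3 queries, 4 feature levels) are assumptions, not values from the repository. It checks that the product is written to a separate tensor, so `reference_points` is left untouched and the scaling does not compound from one layer to the next.

```python
import torch


def scale_reference_points(reference_points, src_valid_ratios):
    """Hypothetical wrapper around the snippet quoted in this issue."""
    if reference_points.shape[-1] == 4:
        ratios = torch.cat([src_valid_ratios, src_valid_ratios], -1)
    else:
        assert reference_points.shape[-1] == 2
        ratios = src_valid_ratios
    # (N, Lq, 1, C) * (N, 1, L, C) broadcasts to (N, Lq, L, C)
    return reference_points[:, :, None] * ratios[:, None]


torch.manual_seed(0)
N, Lq, L = 2, 3, 4  # assumed toy sizes: batch, queries, feature levels
reference_points = torch.rand(N, Lq, 2)
src_valid_ratios = torch.rand(N, L, 2)

original = reference_points.clone()
first = scale_reference_points(reference_points, src_valid_ratios)
second = scale_reference_points(reference_points, src_valid_ratios)

print(tuple(first.shape))                        # (2, 3, 4, 2)
print(torch.equal(reference_points, original))   # True: input not modified
print(torch.equal(first, second))                # True: no accumulation
```

Under these assumptions, repeating the multiplication with unchanged `reference_points` gives the same result every time, so it would only matter per layer if the decoder updates `reference_points` between layers (as an iterative refinement step would); in that case the scaled input has to be recomputed from the new values.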