Open tangtaogo opened 2 years ago
Hi, thanks for your open-source work! I have a question about the main differences between GKT and BEVFormer, beyond the different choice of regions around the prior positions.
Sorry for the late reply. GKT's kernel region consists of fixed integer pixel coordinates, so it can be pre-computed. BEVFormer predicts offsets to its sampling points, similar to Deformable DETR, so its sampling positions are floating-point and dynamic.
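To make the distinction between fixed integer kernel positions and predicted floating-point sampling positions concrete, here is a minimal NumPy sketch. It is not taken from either codebase: the function names (`fixed_kernel_coords`, `gather_integer`, `bilinear_sample`), the `(H, W, C)` feature layout, and the random offsets standing in for network predictions are all assumptions made for illustration.

```python
import numpy as np

def fixed_kernel_coords(center, kernel_size=3):
    """Fixed-kernel style: integer pixel coordinates in a k x k window
    around the rounded prior position. Depends only on geometry, so it
    can be computed once and cached."""
    cx, cy = int(round(center[0])), int(round(center[1]))
    r = kernel_size // 2
    offsets = np.arange(-r, r + 1)
    dx, dy = np.meshgrid(offsets, offsets, indexing="xy")
    return np.stack([cx + dx.ravel(), cy + dy.ravel()], axis=1)  # (k*k, 2) as (x, y)

def gather_integer(feat, coords):
    """Index an (H, W, C) feature map at integer (x, y) positions; no interpolation."""
    h, w = feat.shape[:2]
    xs = np.clip(coords[:, 0], 0, w - 1)
    ys = np.clip(coords[:, 1], 0, h - 1)
    return feat[ys, xs]

def bilinear_sample(feat, points):
    """Deformable style: sample an (H, W, C) feature map at floating-point
    (x, y) positions, which requires bilinear interpolation."""
    h, w = feat.shape[:2]
    x = np.clip(points[:, 0], 0, w - 1)
    y = np.clip(points[:, 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = (x - x0)[:, None], (y - y0)[:, None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bottom = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy image feature map and a projected prior position (both hypothetical).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 4))
prior = np.array([2.4, 1.6])  # (x, y) in pixels

# Fixed kernel: the same integer coordinates for every input.
kernel_feats = gather_integer(feat, fixed_kernel_coords(prior, kernel_size=3))

# Learned offsets: random values stand in for what a network would predict
# per query, so the sampling positions change with the input.
predicted_offsets = rng.normal(scale=1.0, size=(4, 2))
deform_feats = bilinear_sample(feat, prior + predicted_offsets)
```

In this sketch the first path reduces to a plain index lookup on precomputable coordinates, while the second must interpolate at positions that are only known after the offsets have been predicted.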