Closed george0407 closed 1 year ago
Hi, the code is built upon BEVFormer, which has both a temporal version and a spatial-only version. In its temporal version, BEVFormer concatenates the current ref_2d with the aligned ref_2d_shift from another timestamp; in the spatial-only version, it concatenates two identical copies of ref_2d instead. Since TPVFormer does not use temporal information, we simply follow BEVFormer's spatial-only practice, which is why the hw reference points appear twice.
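To illustrate the two cases, here is a minimal sketch in plain Python rather than the actual tensor code. The function name `build_hybrid_ref_2d`, the list-of-tuples representation, and the order of the two entries are assumptions made for illustration only; they are not taken from the BEVFormer or TPVFormer source.

```python
def build_hybrid_ref_2d(ref_2d, ref_2d_shift=None):
    """Return two sets of 2D reference points for one sample.

    ref_2d:       list of (x, y) reference points for the current frame.
    ref_2d_shift: optional list of (x, y) points aligned from another
                  timestamp (temporal case). None means spatial-only.
    """
    if ref_2d_shift is None:
        # Spatial-only case: there is no other timestamp, so the current
        # reference points are simply used twice.
        ref_2d_shift = ref_2d
    # Assumed order: aligned points first, current points second.
    return [list(ref_2d_shift), list(ref_2d)]


# Four illustrative reference points on a 2x2 hw grid (normalized coordinates).
ref_2d = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

# Spatial-only (the TPVFormer situation): both entries are the same points.
spatial_only = build_hybrid_ref_2d(ref_2d)

# Temporal: the first entry holds points shifted to align with another frame.
shifted = [(x + 0.1, y) for x, y in ref_2d]
temporal = build_hybrid_ref_2d(ref_2d, shifted)
```

In both cases the result has the same shape (two sets of points), so the downstream attention code does not need a separate path for the spatial-only setting.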
Thanks for your answer
First of all, I appreciate your work. I wonder why you concatenate the ref pts of hw for 2 layers, as shown above. Thanks a lot.