OpenDriveLab / ViDAR

[CVPR 2024 Highlight] Visual Point Cloud Forecasting
https://arxiv.org/abs/2312.17655
Apache License 2.0

About reference points in temporal cross-attention #35

Closed: artofstate closed this issue 2 months ago

artofstate commented 3 months ago

Hello! I ran into some confusion while reading the code. In the temporal cross-attention part of the future decoder, why are the reference points computed as the coordinates of the future BEV queries in the historical frames? The key and value are the current BEV feature, right?

tomztyang commented 3 months ago

To make sure I understand your issue correctly, could you point to the relevant part of the code?

artofstate commented 3 months ago

https://github.com/OpenDriveLab/ViDAR/blob/936dbf7e010189b68b83b4b61568cfd0fa23e655/projects/mmdet3d_plugin/bevformer/modules/vidar_decoder.py#L253

tomztyang commented 3 months ago

The keys and values are all previous BEV features.
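
One way to read this answer: if the keys and values live on previous BEV feature maps, each future BEV query has to be told where its own location falls on those maps, which would explain why the reference points are expressed in historical-frame coordinates. The sketch below is only an illustration of that idea and is not taken from `vidar_decoder.py`. All names (`bev_grid`, `to_history_frame`, `reference_points`), the 2D rigid transform, and the `pc_range` layout are assumptions made for the example.

```python
import math

def bev_grid(h, w, pc_range):
    """Metric (x, y) cell centres of an h x w BEV grid.

    pc_range = (x_min, y_min, x_max, y_max) is an assumed layout.
    """
    x_min, y_min, x_max, y_max = pc_range
    return [
        (x_min + (j + 0.5) / w * (x_max - x_min),
         y_min + (i + 0.5) / h * (y_max - y_min))
        for i in range(h)
        for j in range(w)
    ]

def to_history_frame(point, dx, dy, yaw):
    """Map a point from the future ego frame into a historical ego frame.

    Assumes a 2D rigid transform: rotate by yaw, then translate by (dx, dy).
    """
    x, y = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + dx, s * x + c * y + dy)

def normalize(point, pc_range):
    """Scale metric coordinates to [0, 1] over the BEV range."""
    x_min, y_min, x_max, y_max = pc_range
    x, y = point
    return ((x - x_min) / (x_max - x_min), (y - y_min) / (y_max - y_min))

def reference_points(h, w, pc_range, dx, dy, yaw):
    """Normalized historical-frame locations, one per future BEV query."""
    return [
        normalize(to_history_frame(p, dx, dy, yaw), pc_range)
        for p in bev_grid(h, w, pc_range)
    ]

# Hypothetical example: the future frame origin sits 2 m ahead (along x)
# of the historical frame origin, with no rotation.
refs = reference_points(2, 2, (-50.0, -50.0, 50.0, 50.0), dx=2.0, dy=0.0, yaw=0.0)
```

In a deformable-attention setup, points like `refs` would be the locations around which each query samples the previous BEV features. Points that land outside [0, 1] fall outside the historical map, and how those are handled is left open here.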