OpenDriveLab / Birds-eye-view-Perception

[IEEE T-PAMI] Awesome BEV perception research and cookbook for audiences of all levels in autonomous driving
https://doi.org/10.1109/TPAMI.2023.3333838
Apache License 2.0

Question about the paper: BEVDet does not use LiDAR as supervision for depth prediction #26

Closed · northeastsquare closed this issue 10 months ago

northeastsquare commented 10 months ago

Hello, the paper "Delving into the Devils of Bird's-eye-view Perception - A Review, Evaluation and Recipe" contains the passage below. It says that BEVDet is a follow-up to CaDDN and uses LiDAR depth information to supervise the training of its depth prediction network. In fact, BEVDet does not use LiDAR point clouds to supervise depth prediction.

The main difference between LSS [57] and CaDDN [46] is that CaDDN uses depth ground truth to supervise its categorical depth distribution prediction, thus owning a superior depth network to extract 3D information from 2D space. This track is followed by subsequent work such as BEVDet [47], its temporal version BEVDet4D [64], and BEVDepth [49].
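For context, the supervision the quoted passage refers to is a per-pixel classification loss over discretized depth bins, with targets taken from LiDAR points projected into the image. Below is a minimal PyTorch-style sketch in that spirit; the function name, tensor layout, and binning scheme are illustrative assumptions, not CaDDN's or BEVDepth's actual code.

```python
import torch
import torch.nn.functional as F

def depth_distribution_loss(depth_logits, lidar_depth, bin_edges):
    """Cross-entropy supervision of a categorical depth distribution with
    projected LiDAR depth as ground truth (illustrative sketch only).

    depth_logits: (B, D, H, W) per-pixel logits over D depth bins
    lidar_depth:  (B, H, W) sparse depth from LiDAR points projected into
                  the image; 0 where no point lands on the pixel
    bin_edges:    (D,) upper edges of the D depth bins in meters, ascending
    """
    D = depth_logits.shape[1]
    valid = lidar_depth > 0                                   # pixels that received a LiDAR hit

    # Turn each valid depth value into a bin index (the classification target).
    target = torch.bucketize(lidar_depth, bin_edges).clamp(max=D - 1)

    # Per-pixel classification loss, averaged over supervised pixels only.
    loss = F.cross_entropy(depth_logits, target, reduction="none")
    return (loss * valid).sum() / valid.sum().clamp(min=1)
```

As the issue points out, BEVDet itself does not apply such a loss; its depth distribution is trained only implicitly through the detection objective.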

faikit commented 10 months ago

Hi, thanks for your attention.

Actually, it is the depth supervision that distinguishes CaDDN from LSS. BEVDet follows the work of LSS and CaDDN, but upgrades their paradigm by constructing an exclusive data augmentation strategy based on the decoupling effect of the view transformer.
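To make the decoupling point concrete: because the view transformer lifts image features into a canonical BEV grid, an augmentation applied in BEV space only needs to be mirrored onto the 3D boxes, independently of whatever augmentation was applied to the input images. A minimal sketch of one such BEV-space augmentation (a lateral flip); the helper name, tensor layout, and yaw convention are assumptions for illustration, not BEVDet's actual implementation.

```python
import torch

def bev_flip_augment(bev_feat, gt_boxes):
    """Illustrative BEV-space augmentation: mirror the BEV feature map and
    the 3D boxes across the lateral (y) axis, assuming a (B, C, H, W) BEV
    feature whose last dimension is y and boxes given as
    (x, y, z, dx, dy, dz, yaw).
    """
    bev_feat = torch.flip(bev_feat, dims=[-1])    # mirror the BEV grid laterally
    gt_boxes = gt_boxes.clone()
    gt_boxes[:, 1] = -gt_boxes[:, 1]              # mirror box centers across y = 0
    gt_boxes[:, 6] = -gt_boxes[:, 6]              # mirror the heading angle accordingly
    return bev_feat, gt_boxes
```

Because this operates purely on the BEV grid and the 3D targets, it can be combined freely with image-space augmentations (resize, crop, flip) applied before the backbone.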