Yzichen / FlashOCC

During temporal training, how are the historical BEV features aligned to the current frame? #56

Closed yanyangchun closed 1 month ago

yanyangchun commented 2 months ago

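Hello, according to the paper, the historical BEV features are aligned to the current frame. While debugging projects/configs/flashocc/flashocc-r50-4d-stereo.py and reading the code, I found that once the bev_feat_list for the current and previous frames is obtained, the features are concatenated directly, with no alignment step. Could you point me to where the alignment of historical frames to the current frame is implemented?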

https://github.com/Yzichen/FlashOCC/blob/d64804d7bf4178b1a79284f50f53c1bbf79dd1aa/projects/mmdet3d_plugin/models/detectors/bevstereo4d.py#L220

Yzichen commented 1 month ago

https://github.com/Yzichen/FlashOCC/blob/d64804d7bf4178b1a79284f50f53c1bbf79dd1aa/projects/mmdet3d_plugin/models/detectors/bevdet4d.py#L241-L249

If _align_after_view_transfromation=True_ is not specified, the temporal alignment is done inside the view transformer. To make this possible, the transform matrix sensor2keyegos, which maps each camera to the ego coordinate system of the current (key) frame, is computed for every frame, as shown in the code linked above.

yanyangchun commented 1 month ago

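Thanks for the answer. After reading the code carefully, I see that the ego2global of the first view of the first frame (the current frame) is used as keyego2global, and all views of the historical frames are then projected into the keyego coordinate system. This handles both the historical-frame alignment and the multi-view alignment in one step.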
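
For readers following along, here is a minimal NumPy sketch of the idea described in this thread. It is not the repository's code: the function name `sensor_to_keyego`, the helper `pose`, and the `(T, N, 4, 4)` array layout (frame 0 is the current frame, no batch dimension) are assumptions made for illustration. It only shows how composing each camera's sensor2ego and ego2global with the inverse of keyego2global places every view of every frame in one common coordinate system.

```python
import numpy as np


def pose(yaw, tx, ty):
    """Hypothetical helper: 4x4 homogeneous transform (rotation about z, then translation)."""
    c, s = np.cos(yaw), np.sin(yaw)
    mat = np.eye(4)
    mat[:2, :2] = [[c, -s], [s, c]]
    mat[:2, 3] = [tx, ty]
    return mat


def sensor_to_keyego(sensor2egos, ego2globals):
    """Map every camera of every frame into the key (current) ego frame.

    Assumed layout: both inputs have shape (T, N, 4, 4), where T is the
    number of frames (index 0 = current frame) and N the number of views.
    """
    # The ego pose of the first view of the current frame defines the key ego frame.
    keyego2global = ego2globals[0, 0]
    global2keyego = np.linalg.inv(keyego2global)
    # sensor -> its own ego -> global -> key ego, broadcast over all frames and views.
    return global2keyego @ ego2globals @ sensor2egos


# Two frames, two views; the ego vehicle moved and turned slightly between frames.
sensor2egos = np.stack([np.stack([pose(0.0, 1.0, 0.0), pose(np.pi / 2, 0.0, 1.0)])] * 2)
ego2globals = np.stack([
    np.stack([pose(0.1, 10.0, 5.0)] * 2),  # current frame
    np.stack([pose(0.0, 8.0, 5.0)] * 2),   # previous frame
])
sensor2keyegos = sensor_to_keyego(sensor2egos, ego2globals)
```

Because the key frame's own ego2global cancels against its inverse, the current-frame cameras keep their original sensor2ego, while a static world point observed from any historical camera lands at the same keyego coordinates as when it is observed from the current frame.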