Question about the velocity handling in SparseBox3DRefinementModule

commc commented 4 months ago

https://github.com/HorizonRobotics/Sparse4D/blob/c41df4bbf7bc82490f11ff55173abfcb3fb91425/projects/mmdet3d_plugin/models/detection3d/detection3d_blocks.py#L140-L145

Your SparseBox3DRefinementModule post-processes the velocity, and I am puzzled by these two lines:
translation = torch.transpose(output[..., VX:], 0, -1)
velocity = torch.transpose(translation / time_interval, 0, -1)
What does translation refer to here? My understanding is that it is the velocity. Is that correct?

Also, why is translation then divided by time_interval?
velocity = torch.transpose(translation / time_interval, 0, -1)
I would be very grateful if you could explain.

linxuewu commented 4 months ago

What the network outputs at first is a displacement (translation). It is then divided by the time interval to give the final output, velocity.

linxuewu commented 4 months ago

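The network's input is video frames, and what can be estimated directly from two consecutive frames is displacement. Velocity is coupled with the time interval, so the network outputs displacement and the division by the time interval happens afterwards.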
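
For readers following along, here is a minimal sketch of that conversion. It is not the repository's code, only an illustration under stated assumptions: the helper name displacement_to_velocity is hypothetical, VX = 8 and the 11-dimensional box are assumed values, output is taken to have shape (batch, num_anchor, box_dim), and time_interval to have shape (batch,). Under those shapes, the two transposes appear to exist only so that the batch axis lands last and broadcasts against time_interval.

```python
import torch

VX = 8  # assumption: index where the velocity channels start in the box encoding


def displacement_to_velocity(output, time_interval):
    """Hypothetical helper mirroring the two lines quoted in the question.

    output:        (batch, num_anchor, box_dim), last channels hold per-frame displacement
    time_interval: (batch,), seconds between the two consecutive frames
    """
    # Swap the batch axis to the end: (batch, num_anchor, 3) -> (3, num_anchor, batch)
    translation = torch.transpose(output[..., VX:], 0, -1)
    # time_interval (batch,) now broadcasts over the last axis; then swap back
    velocity = torch.transpose(translation / time_interval, 0, -1)
    return velocity


# Two samples with the same displacement of 1.0 per axis but different frame gaps
output = torch.zeros(2, 4, 11)
output[..., VX:] = 1.0
time_interval = torch.tensor([0.5, 1.0])

velocity = displacement_to_velocity(output, time_interval)

# Equivalent formulation with explicit broadcasting instead of transposes
velocity_alt = output[..., VX:] / time_interval[:, None, None]
```

With these inputs, the sample with the 0.5 s gap ends up with twice the velocity of the sample with the 1.0 s gap, even though the predicted displacement is identical, which is the coupling described in the reply above.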

commc commented 4 months ago

Thanks for the explanation, that makes sense now!
