Sense-GVT / Fast-BEV

Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline

2d-to-3d on GPU #26

Closed Rookielike closed 1 year ago

Rookielike commented 1 year ago

Hi bro, for voxels that fall in the overlapping area of multiple views, you directly adopt the first view encountered to speed up table building. But this means there is no feature fusion in the overlapping area. Won't this degrade performance?

ymlab commented 1 year ago

Some of our comparative experiments show that the accuracy loss caused by this modification is relatively small, only about 0.3 to 1 mAP, while the inference speed improves greatly. Intuitively, even though feature fusion in the overlapping areas is lost, each voxel there still receives a feature from one camera during training.
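To make the "first encountered view" idea concrete, here is a minimal NumPy sketch of a 2D-to-3D lookup table. It is not the repository's implementation: the function names (`build_first_hit_lut`, `fill_voxels`), the array layouts, and the assumption that each camera is described by a 3x4 matrix projecting voxel centers directly to pixel coordinates are all hypothetical and chosen only for illustration.

```python
import numpy as np

def build_first_hit_lut(voxel_centers, proj_mats, img_h, img_w):
    """Map each voxel to (view, row, col) of the first view that sees it.

    voxel_centers: (N, 3) voxel centers (assumed layout).
    proj_mats:     (V, 3, 4) voxel-to-pixel projection matrices (assumed).
    Returns an (N, 3) int array; rows stay -1 where no view sees the voxel.
    """
    n = voxel_centers.shape[0]
    homo = np.concatenate([voxel_centers, np.ones((n, 1))], axis=1)  # (N, 4)
    lut = np.full((n, 3), -1, dtype=np.int64)
    for view, proj in enumerate(proj_mats):
        uvw = homo @ proj.T                      # (N, 3) homogeneous pixels
        depth = uvw[:, 2]
        in_front = depth > 1e-6                  # drop points behind the camera
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.floor(uvw[:, 0] / safe_depth).astype(np.int64)
        v = np.floor(uvw[:, 1] / safe_depth).astype(np.int64)
        valid = in_front & (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
        take = valid & (lut[:, 0] < 0)           # only voxels not yet claimed
        lut[take, 0] = view
        lut[take, 1] = v[take]
        lut[take, 2] = u[take]
    return lut

def fill_voxels(features, lut):
    """Gather one feature per voxel. features: (V, H, W, C) -> (N, C)."""
    out = np.zeros((lut.shape[0], features.shape[-1]), dtype=features.dtype)
    hit = lut[:, 0] >= 0
    out[hit] = features[lut[hit, 0], lut[hit, 1], lut[hit, 2]]
    return out
```

Because the table depends only on the camera matrices, it can be built once and reused; at inference time each voxel costs a single gather, regardless of how many views overlap there.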

Rookielike commented 1 year ago

thx a lot

ymlab commented 1 year ago

From another perspective, we think that the cost of building 6 independent voxel volumes just to fuse these overlapping regions is too high. So instead of looking for ways to speed up the fusion process, we chose to skip this step entirely.
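For contrast, the fusion path being skipped could look roughly like the NumPy sketch below. It is only an illustration under stated assumptions: the function name `fuse_per_view_volumes` is hypothetical, each camera is assumed to have a 3x4 voxel-to-pixel projection matrix, and averaging is assumed as the fusion operator (the thread does not say which operator was used). Note the `(V, N, C)` buffer: memory and work grow with the number of views.

```python
import numpy as np

def fuse_per_view_volumes(voxel_centers, proj_mats, features):
    """Build one voxel volume per view, then average where views overlap.

    voxel_centers: (N, 3); proj_mats: (V, 3, 4); features: (V, H, W, C).
    All layouts are assumptions made for this sketch.
    Returns (fused, counts): (N, C) features and (N,) number of views per voxel.
    """
    n_views, img_h, img_w, channels = features.shape
    n = voxel_centers.shape[0]
    homo = np.concatenate([voxel_centers, np.ones((n, 1))], axis=1)  # (N, 4)
    # One full volume per view: this is the expensive part.
    volumes = np.zeros((n_views, n, channels), dtype=np.float64)
    masks = np.zeros((n_views, n), dtype=bool)
    for view in range(n_views):
        uvw = homo @ proj_mats[view].T
        depth = uvw[:, 2]
        in_front = depth > 1e-6
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.floor(uvw[:, 0] / safe_depth).astype(np.int64)
        v = np.floor(uvw[:, 1] / safe_depth).astype(np.int64)
        valid = in_front & (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
        volumes[view, valid] = features[view, v[valid], u[valid]]
        masks[view] = valid
    counts = masks.sum(axis=0)                               # views per voxel
    fused = volumes.sum(axis=0) / np.maximum(counts, 1)[:, None]
    return fused, counts
```

With 6 cameras this allocates and reduces six volumes even though most voxels are seen by only one view, which is the cost the reply refers to.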