mit-han-lab / bevfusion

[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
https://bevfusion.mit.edu
Apache License 2.0
2.37k stars, 427 forks

Image BEV feature was getting worse during training #587

san9569 closed this issue 6 months ago

san9569 commented 10 months ago

Hi

I'm trying to train BEVFusion on my custom dataset, using a single image and its corresponding point cloud.

However, the performance gap between the LiDAR-only model and the fusion model was small.

So I visualized the image BEV features at specific iterations during training and observed that they were getting worse: the image BEV feature can no longer represent the scene in the BEV map. I suspect the depth estimation is the problem.

[Image: image BEV feature at iteration 0]

[Image: image BEV feature at iteration 1900]

I think the image BEV feature is being driven toward an all-zero map during training because it does not provide useful information for object detection.
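The zero-map hypothesis can be checked numerically instead of only by eye. The following is a minimal sketch, assuming the image BEV feature for one sample has been dumped as a `(C, H, W)` array; `bev_feature_stats` and the threshold `eps` are hypothetical names chosen for illustration.

```python
import numpy as np


def bev_feature_stats(bev_feat, eps=1e-3):
    """Summarize how close a BEV feature map is to an all-zero map.

    bev_feat: array of shape (C, H, W) (hypothetical layout).
    eps:      threshold below which a BEV cell counts as "near zero".
    """
    feat = np.asarray(bev_feat, dtype=np.float64)
    # Per-cell L2 norm across channels; also usable as a heatmap for plotting.
    energy = np.linalg.norm(feat, axis=0)
    return {
        "mean_abs": float(np.abs(feat).mean()),
        "near_zero_frac": float((energy < eps).mean()),
        "energy_map": energy,
    }
```

Logging `mean_abs` and `near_zero_frac` at regular iterations (for example at 0 and 1900) would show whether the activations actually shrink toward zero or only change in spatial structure.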

Can you give me some advice? Thank you in advance.

gerardmartin2 commented 7 months ago

Hello @eugenebak, do you remember the changes you made to the code?

zhijian-liu commented 6 months ago

Adding support for custom datasets is beyond the scope of this codebase, and unfortunately, we don't have the bandwidth to support such customized requests.