XuyangBai / TransFusion

[PyTorch] Official implementation of CVPR2022 paper "TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers". https://arxiv.org/abs/2203.11496
Apache License 2.0
619 stars 76 forks

transfusion_nusc_voxel_LC training questions #35

Closed CeZh closed 2 years ago

CeZh commented 2 years ago

Hello, thanks for your impressive work! I'm currently trying to reproduce the results on the nuScenes dataset using the transfusion_nusc_voxel_LC setup. May I know how many epochs are required for it to converge? Thanks.

I have trained the model for 5 epochs now, but the mAP is still 0. I use the nuScenes dataset and the same setup as provided in the configuration doc.

XuyangBai commented 2 years ago

Hi, I assume you are training TransFusionLC directly from scratch. I basically follow a 2-stage training pipeline: first train TransFusionL for 20 epochs, then train TransFusionLC for another 6 epochs using the pretrained LiDAR backbone and image backbone. Details can be found here
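A rough sketch of the 2-stage recipe as commands, assuming the standard mmdetection3d-style `tools/dist_train.sh` launcher that this repo ships with; the checkpoint path and GPU count are placeholders, and the exact flag for loading the stage-1 weights depends on how the LC config points at the pretrained backbone:

```shell
# Stage 1: train the LiDAR-only model (TransFusion-L) for 20 epochs.
bash tools/dist_train.sh configs/transfusion_nusc_voxel_L.py 8

# Stage 2: train the fusion model (TransFusion-LC) for 6 more epochs,
# initializing from the stage-1 LiDAR backbone (path is hypothetical;
# the LC config typically references the pretrained checkpoint itself).
bash tools/dist_train.sh configs/transfusion_nusc_voxel_LC.py 8
```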

CeZh commented 2 years ago

Thanks for your reply. So in summary I will train the transfusion_nusc_voxel_L.py at first for 20 epochs, then, train the TransFusionLC.py for the remaining 6 epochs?

XuyangBai commented 2 years ago

Yes, exactly. And there is one data-augmentation detail adopted in the training of TransFusionL:

For the fade strategy proposed by PointAugmenting (disabling the copy-and-paste augmentation for the last 5 epochs), we currently implement it by manually stopping training at epoch 15 and resuming without copy-and-paste augmentation. If you find a more elegant way to implement this strategy, please let us know; we would really appreciate it. The fade strategy removes a lot of false positives and improves mAP remarkably, especially for TransFusion-L, while having less influence on TransFusion.
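One way to sketch the "resume without copy-and-paste" step: in mmdetection3d-style configs the copy-and-paste (GT-sampling) augmentation is a pipeline entry of type `ObjectSample`, so the fade can be expressed as filtering that step out of the train pipeline before the final epochs. The pipeline contents below are illustrative, not the exact TransFusion config:

```python
# Illustrative mmdet3d-style train pipeline (entries abbreviated).
train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR'),
    dict(type='ObjectSample', db_sampler=dict()),  # copy-and-paste (GT-paste) augmentation
    dict(type='GlobalRotScaleTrans'),
    dict(type='PointsRangeFilter'),
]

def fade_pipeline(pipeline):
    """Return the pipeline with the copy-and-paste (ObjectSample) step removed,
    as used for the last 5 "fade" epochs."""
    return [step for step in pipeline if step['type'] != 'ObjectSample']

faded = fade_pipeline(train_pipeline)
```

A more automatic alternative would be a training hook that swaps the dataset pipeline once a target epoch is reached, which avoids the manual stop-and-resume.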

CeZh commented 2 years ago

Thank you