XuyangBai / TransFusion

[PyTorch] Official implementation of CVPR2022 paper "TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers". https://arxiv.org/abs/2203.11496
Apache License 2.0
619 stars 76 forks

possible explanation of no improvement on Waymo #50

Closed AndyYuan96 closed 2 years ago

AndyYuan96 commented 2 years ago

Hi Xuyang, after I successfully reproduced the performance on nuScenes, I found that the main improvement of TransFusion-L over CenterPoint, and of TransFusion-LC over TransFusion-L, is in mAP; the other indicators do not improve much, so the NDS gain comes from mAP. Since nuScenes computes mAP differently from Waymo, the improvement of TransFusion may not contribute much to Waymo's mAP, which would explain why TransFusion's performance there is almost the same as CenterPoint's.
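The metric difference this comment hinges on can be made concrete: nuScenes counts a detection as a true positive by BEV center distance, while Waymo's mAP uses IoU-based matching. A minimal sketch (the axis-aligned BEV IoU below is a simplification; Waymo's actual evaluation uses rotated 3D/BEV IoU):

```python
import numpy as np

def nuscenes_match(pred_center, gt_center, dist_thr=2.0):
    """nuScenes-style TP criterion: BEV center distance under a threshold
    (nuScenes averages over thresholds of 0.5/1/2/4 m)."""
    d = np.linalg.norm(np.asarray(pred_center[:2]) - np.asarray(gt_center[:2]))
    return d <= dist_thr

def bev_iou_axis_aligned(box_a, box_b):
    """Waymo-style TP criterion is IoU-based; this axis-aligned BEV version
    only illustrates the idea. Boxes are (cx, cy, dx, dy)."""
    ax0, ax1 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    ay0, ay1 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx0, bx1 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    by0, by1 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union

# A prediction 1 m off-center: a hit under nuScenes' 2 m threshold,
# but far below a 0.7 IoU threshold on a 2 m x 4 m box.
hit = nuscenes_match([1.0, 0.0], [0.0, 0.0])
iou = bev_iou_axis_aligned([1.0, 0.0, 2.0, 4.0], [0.0, 0.0, 2.0, 4.0])
```

So a localization improvement that moves centers closer raises nuScenes mAP directly, while IoU-based matching also demands accurate size and heading.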

AndyYuan96 commented 2 years ago

What's more, I want to try the TransFusion head in mmdetection3d 1.0, since I have written a lot of code based on mmdetection3d 1.0. After copying the TransFusion head to mmdetection3d 1.0, I trained with load_interval=5 (1/5 of the data) to compare against mmdetection3d 0.11, and mAP and NDS dropped by around 2 points. After some experiments I found the cause is weight initialization: mmdet3d 1.0 uses a new init_weights function, and once I used the weight initialization from mmdet3d 0.11, the performance no longer dropped.
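One way to isolate such an initialization difference is to re-apply the old-style init explicitly after building the head. The sketch below is hypothetical: the toy head and the Xavier/zero-bias scheme stand in for whatever mmdet3d 0.11 actually did for TransFusionHead.

```python
# Hypothetical sketch: pin an mmdet3d 0.11-style initialization instead of
# relying on the newer init_weights logic. The layer choices here are
# assumptions, not the real TransFusionHead init.
import torch.nn as nn

def legacy_init_weights(module: nn.Module) -> None:
    """Xavier-uniform weights and zero biases for conv/linear layers."""
    for m in module.modules():
        if isinstance(m, (nn.Linear, nn.Conv1d, nn.Conv2d)):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

# Toy stand-in for a detection head; call this after the framework's own init.
head = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.Linear(128, 10))
legacy_init_weights(head)
```

Running an A/B comparison with only the init function swapped, as described above, is what separates an initialization regression from other porting bugs.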

AndyYuan96 commented 2 years ago

What's more, I also got an interesting result. With the pillar backbone I can reproduce the result, and I wanted to find out how much improvement the transformer contributes, so I commented out the transformer code (the code that fuses the queries with the feature map), and the resulting mAP and NDS are only about 0.5 points lower than TransFusion's. But TransFusion does improve over CenterPoint, so I think the main reason may be that TransFusion uses an IoU loss while mmdet3d's CenterPoint does not, as IoU loss has improved performance in many papers.
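The ablation described above (commenting out the query/feature-map fusion) can be sketched with a toy module. This is not the actual TransFusion decoder layer, just an illustration of bypassing the attention step while keeping the rest of the pipeline fixed:

```python
import torch
import torch.nn as nn

class QueryFusion(nn.Module):
    """Toy stand-in for the decoder step that fuses object queries with BEV
    feature-map tokens; the real TransFusion layer is more elaborate."""
    def __init__(self, dim: int, use_transformer: bool = True):
        super().__init__()
        self.use_transformer = use_transformer
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, queries: torch.Tensor, feat_tokens: torch.Tensor) -> torch.Tensor:
        if not self.use_transformer:
            # Ablation: skip fusion, predict from the initial queries alone.
            return queries
        attended, _ = self.attn(queries, feat_tokens, feat_tokens)
        return queries + attended

queries = torch.randn(2, 200, 128)        # e.g. 200 object queries per sample
feat_tokens = torch.randn(2, 1024, 128)   # flattened BEV feature map
full = QueryFusion(128, use_transformer=True)(queries, feat_tokens)
ablated = QueryFusion(128, use_transformer=False)(queries, feat_tokens)
```

With such a switch, the gap between the two branches on the same training schedule bounds the transformer's contribution.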

XuyangBai commented 2 years ago

> What's more, I want to try the TransFusion head in mmdetection3d 1.0, since I have written a lot of code based on mmdetection3d 1.0. After copying the TransFusion head to mmdetection3d 1.0, I trained with load_interval=5 (1/5 of the data) to compare against mmdetection3d 0.11, and mAP and NDS dropped by around 2 points. After some experiments I found the cause is weight initialization: mmdet3d 1.0 uses a new init_weights function, and once I used the weight initialization from mmdet3d 0.11, the performance no longer dropped.

Interesting observations, and thanks for your trial and sharing. I have never tried mmdet3d 1.0, so I am not sure whether such an initialization issue is also observed for other methods.

> What's more, I also got an interesting result. With the pillar backbone I can reproduce the result, and I wanted to find out how much improvement the transformer contributes, so I commented out the transformer code (the code that fuses the queries with the feature map), and the resulting mAP and NDS are only about 0.5 points lower than TransFusion's. But TransFusion does improve over CenterPoint, so I think the main reason may be that TransFusion uses an IoU loss while mmdet3d's CenterPoint does not, as IoU loss has improved performance in many papers.

TransFusion doesn't use an IoU loss:

https://github.com/XuyangBai/TransFusion/blob/399bda09a3b6449313ccc302df40651f77ec78bf/configs/transfusion_nusc_pillar_LC.py#L213-L216
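For reference, an mmdet3d-style head config declares each loss term explicitly, and the point above is that no loss_iou entry is configured. The fragment below is illustrative only; the loss classes and weights are assumptions, and the actual settings are at the linked lines of transfusion_nusc_pillar_LC.py:

```python
# Hypothetical mmdet3d-style loss config fragment (not the actual linked lines).
loss_cfg = dict(
    loss_cls=dict(type='FocalLoss', use_sigmoid=True,
                  gamma=2.0, alpha=0.25, loss_weight=1.0),
    loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=0.25),
    loss_heatmap=dict(type='GaussianFocalLoss', reduction='mean',
                      loss_weight=1.0),
)
```

Seeing "iou" in the matching/assignment code does not imply an IoU term in the training loss; only the configured loss_* entries are optimized.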

AndyYuan96 commented 2 years ago

> > What's more, I want to try the TransFusion head in mmdetection3d 1.0, since I have written a lot of code based on mmdetection3d 1.0. After copying the TransFusion head to mmdetection3d 1.0, I trained with load_interval=5 (1/5 of the data) to compare against mmdetection3d 0.11, and mAP and NDS dropped by around 2 points. After some experiments I found the cause is weight initialization: mmdet3d 1.0 uses a new init_weights function, and once I used the weight initialization from mmdet3d 0.11, the performance no longer dropped.
>
> Interesting observations, and thanks for your trial and sharing. I have never tried mmdet3d 1.0, so I am not sure whether such an initialization issue is also observed for other methods.
>
> > What's more, I also got an interesting result. With the pillar backbone I can reproduce the result, and I wanted to find out how much improvement the transformer contributes, so I commented out the transformer code (the code that fuses the queries with the feature map), and the resulting mAP and NDS are only about 0.5 points lower than TransFusion's. But TransFusion does improve over CenterPoint, so I think the main reason may be that TransFusion uses an IoU loss while mmdet3d's CenterPoint does not, as IoU loss has improved performance in many papers.
>
> TransFusion doesn't use an IoU loss:
>
> https://github.com/XuyangBai/TransFusion/blob/399bda09a3b6449313ccc302df40651f77ec78bf/configs/transfusion_nusc_pillar_LC.py#L213-L216

Sorry, I didn't read the code very carefully; I saw "iou" in the loss function, so I assumed an IoU loss was used. Anyway, I just want to share the result that after commenting out the transformer, the performance is not very different; I think a 0.5-point drop is within the normal fluctuation of training on val. And the epoch-20 results without the GT-aug fade strategy are almost the same.

gopi-erabati commented 1 year ago

> What's more, I want to try the TransFusion head in mmdetection3d 1.0, since I have written a lot of code based on mmdetection3d 1.0. After copying the TransFusion head to mmdetection3d 1.0, I trained with load_interval=5 (1/5 of the data) to compare against mmdetection3d 0.11, and mAP and NDS dropped by around 2 points. After some experiments I found the cause is weight initialization: mmdet3d 1.0 uses a new init_weights function, and once I used the weight initialization from mmdet3d 0.11, the performance no longer dropped.

Hey @AndyYuan96, thanks for your trial. I also want to try mmdet3d 1.0, and I have some queries: did you use mmdet3d 1.0 both for the code and to generate the metadata (*.pkl) files? If yes, did you change any code in TransFusionHead to adapt to the mmdet3d 1.0 coordinate system refactoring? Can you please help? Thanks!
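On the coordinate-system question: boxes ported across the v1.0 refactoring typically need their BEV dimensions reordered and their yaw remapped between conventions. The sketch below is an assumption, not the official converter; the dim swap and the yaw formula should be verified against the mmdet3d v1.0 compatibility notes before use.

```python
import numpy as np

def limit_period(val, offset=0.5, period=np.pi * 2):
    # Wrap angles into [-offset*period, (1-offset)*period).
    return val - np.floor(val / period + offset) * period

def legacy_to_v1_boxes(boxes):
    """boxes: (N, 7) array of [x, y, z, w, l, h, yaw] in a pre-1.0 LiDAR
    convention. Returns [x, y, z, dx, dy, dz, yaw] under the ASSUMED v1.0
    convention: (w, l) reordered to (dx, dy) and yaw flipped and shifted."""
    boxes = np.asarray(boxes, dtype=np.float64).copy()
    boxes[:, [3, 4]] = boxes[:, [4, 3]]            # swap w/l -> dx/dy
    boxes[:, 6] = limit_period(-boxes[:, 6] - np.pi / 2)  # assumed yaw remap
    return boxes

converted = legacy_to_v1_boxes([[0.0, 0.0, 0.0, 1.0, 2.0, 1.5, 0.0]])
```

Besides box conversion, the .pkl annotation files generated under 0.x and 1.0 use the respective conventions, so the data-generation and training code versions must match.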