Closed xu19971109 closed 8 months ago
Thanks for your interest.
This pipeline follows TransFusion and DeepInteraction. The LiDAR branch is initialized with TransFusion-L pretrained weights, but the camera branch is initialized randomly, so the camera branch cannot be frozen. We also tried training all the parameters together (including the LiDAR branch), which led to similar performance.
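For readers who want to try the different freezing setups discussed in this thread, here is a minimal sketch of freezing one branch by parameter-name prefix. It is not taken from this repository: the helper `freeze_by_prefix` and the names `pts_backbone`, `img_backbone`, and `fusion_layer` are hypothetical, and the stand-in objects only mimic the `requires_grad` attribute that PyTorch parameters expose.

```python
from types import SimpleNamespace


def freeze_by_prefix(named_params, frozen_prefixes):
    """Disable gradients for parameters whose name starts with a frozen prefix.

    `named_params` is any iterable of (name, param) pairs where `param` has a
    writable `requires_grad` attribute. Returns the names left trainable.
    """
    frozen_prefixes = tuple(frozen_prefixes)
    trainable = []
    for name, param in named_params:
        if name.startswith(frozen_prefixes):
            param.requires_grad = False  # excluded from gradient updates
        elif param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-in parameters with hypothetical names (assumption, not the real model).
params = {
    "pts_backbone.conv.weight": SimpleNamespace(requires_grad=True),
    "img_backbone.conv.weight": SimpleNamespace(requires_grad=True),
    "fusion_layer.attn.weight": SimpleNamespace(requires_grad=True),
}

# Freeze only the LiDAR branch; camera and fusion parameters stay trainable.
trainable = freeze_by_prefix(params.items(), ["pts_"])
print(trainable)
```

With a real PyTorch model, the same helper should work on `model.named_parameters()`, after which only the parameters that still have `requires_grad=True` would be passed to the optimizer. Note that disabling gradients alone does not stop BatchNorm running statistics from updating; the frozen modules would also need to be put in eval mode.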
As you mention in your paper, the LiDAR branch is frozen and the image backbone is loaded from a pretrained model, so it is actually the image branch and the fusion branch whose parameters are updated. Why do you freeze only the LiDAR branch?
Have you tried freezing the image branch and training the LiDAR and fusion branches? Or freezing both the image and LiDAR backbones and updating only the fusion parameters?