XuyangBai / TransFusion

[PyTorch] Official implementation of CVPR2022 paper "TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers". https://arxiv.org/abs/2203.11496
Apache License 2.0
619 stars 76 forks

loss bbox not normal when x,y ranges are inconsistent #17

Open study1994 opened 2 years ago

study1994 commented 2 years ago

I am trying to train TransFusion-L and TransFusion-LC on my own data. When I train with the following settings:

point_cloud_range = [-70.4, -51.2, -2.0, 70.4, 51.2, 4.0]
type='PointPillarsScatter', in_channels=64, output_shape=(512,704)
grid_size=[704, 512, 1]

the drop in the losses is abnormal, like this: (image: B12C4ADA-85B0-4535-9D2F-C2360796D130). The training does not converge, although mmdetection3d does well. With point_cloud_range = [-51.2, -51.2, -2.0, 51.2, 51.2, 4.0], TransFusion also does well. I can't find a bug in the code. Can you give me some advice?
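As a sanity check on the settings quoted above, the following minimal sketch recomputes the grid size from the point cloud range. It assumes a voxel size of 0.2 m in x and y and a single pillar covering the full z range; the voxel size is not stated in the issue, and `grid_size` is a hypothetical helper, not a function from the repository.

```python
def grid_size(point_cloud_range, voxel_size):
    """Number of voxels along x, y, z for a given range and voxel size."""
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    extents = (x_max - x_min, y_max - y_min, z_max - z_min)
    return [round(e / v) for e, v in zip(extents, voxel_size)]


# Assumed voxel size (not given in the issue): 0.2 m pillars, one bin in z.
voxel_size = (0.2, 0.2, 6.0)
pc_range = [-70.4, -51.2, -2.0, 70.4, 51.2, 4.0]

grid = grid_size(pc_range, voxel_size)  # [nx, ny, nz]
output_shape = (grid[1], grid[0])       # PointPillarsScatter takes (ny, nx)

print(grid, output_shape)
```

Under that assumption the quoted grid_size and output_shape agree with the range, which suggests the problem lies somewhere other than these three settings.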

XuyangBai commented 2 years ago

It looks normal if this is at the beginning of training. Why do you say the drop in losses is abnormal?

study1994 commented 2 years ago

At the end of training the loss is still very large. The following is a log comparison of the two settings: (images: B12C4ADA-85B0-4535-9D2F-C2360796D130, Dingtalk_20220516092128). More detailed logs are here: Link: https://pan.baidu.com/s/12t6i7NxWQ56RKn8h91qz1g?pwd=lqo4 Extraction code: lqo4

XuyangBai commented 2 years ago

Hi, I checked the code and think the potential reason might be the matching cost BBoxBEVL1Cost used in our label assigner: https://github.com/XuyangBai/TransFusion/blob/53370467c1b88f163cbe7b7300a1f588a6761e35/mmdet3d/core/bbox/assigners/hungarian_assigner.py#L25-L36 It calculates the error between the predicted bbox and the ground truth, where the box size is normalized by the pc_range, so when the x and y ranges are inconsistent, their importance in label assignment will also differ. That might lead to a noisy label assignment, because the assigner tends to choose a prediction with a better y prediction as the positive while somewhat ignoring the x prediction.
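To illustrate the effect, here is a minimal sketch of a range-normalized L1 cost in pure Python. It is not the repository's BBoxBEVL1Cost; it only assumes that x and y errors are divided by the corresponding extent of the point cloud range, and `normalized_bev_l1` is a hypothetical name.

```python
def normalized_bev_l1(pred_xy, gt_xy, point_cloud_range):
    """L1 distance in BEV after dividing each axis by the range extent."""
    x_min, y_min, _, x_max, y_max, _ = point_cloud_range
    span = (x_max - x_min, y_max - y_min)
    return sum(abs(p - g) / s for p, g, s in zip(pred_xy, gt_xy, span))


rect = [-70.4, -51.2, -2.0, 70.4, 51.2, 4.0]    # range from the issue
square = [-70.4, -70.4, -2.0, 70.4, 70.4, 4.0]  # suggested square range

gt = (10.0, 5.0)
off_x = (11.0, 5.0)  # prediction that is 1 m off in x only
off_y = (10.0, 6.0)  # prediction that is 1 m off in y only

cost_x_rect = normalized_bev_l1(off_x, gt, rect)
cost_y_rect = normalized_bev_l1(off_y, gt, rect)
cost_x_square = normalized_bev_l1(off_x, gt, square)
cost_y_square = normalized_bev_l1(off_y, gt, square)

print(cost_y_rect / cost_x_rect)      # y error weighs more on the rectangle
print(cost_y_square / cost_x_square)  # equal weighting on the square
```

With the rectangular range, a 1 m error in y costs 140.8 / 102.4 = 1.375 times as much as the same error in x, so a matcher built on this cost favours predictions that are accurate in y.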

A quick way to verify this idea is to set the range to [-70.4, -70.4, -2.0, 70.4, 70.4, 4.0]. An alternative is to use BBoxL1Cost instead, which uses the absolute error between predictions and ground truth. But you would need to tune the weight for reg_cost; I have used this cost before but do not remember the specific value.
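A rough sketch of why the weight needs retuning: an absolute error measured in metres is on a very different scale from a range-normalized one. Both functions below are hypothetical stand-ins operating on BEV centers only, not the actual BBoxL1Cost or BBoxBEVL1Cost implementations.

```python
def absolute_bev_l1(pred_xy, gt_xy):
    """Plain L1 distance in metres; both axes are weighted equally."""
    return sum(abs(p - g) for p, g in zip(pred_xy, gt_xy))


def normalized_bev_l1(pred_xy, gt_xy, point_cloud_range):
    """L1 distance after dividing each axis by the range extent."""
    x_min, y_min, _, x_max, y_max, _ = point_cloud_range
    span = (x_max - x_min, y_max - y_min)
    return sum(abs(p - g) / s for p, g, s in zip(pred_xy, gt_xy, span))


pc_range = [-70.4, -51.2, -2.0, 70.4, 51.2, 4.0]
gt = (10.0, 5.0)
pred = (11.0, 6.0)  # 1 m off in both x and y

abs_cost = absolute_bev_l1(pred, gt)
norm_cost = normalized_bev_l1(pred, gt, pc_range)
scale = abs_cost / norm_cost  # how much larger the absolute cost is

print(abs_cost, norm_cost, scale)
```

For this range the absolute cost is roughly two orders of magnitude larger than the normalized one, so keeping the reg_cost weight unchanged would let the regression term dominate the classification and IoU terms in the matching.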

Hope that helps.

study1994 commented 2 years ago

I'll try it, thank you!

ChristopheZhao commented 2 years ago

My data only has forward-facing radar scans, similar to the KITTI dataset, so the point cloud range differs a lot from Waymo's, especially along the x-axis. How should this be modified?

AmazingRoad commented 2 years ago

My data only has forward-facing radar scans, similar to the KITTI dataset, so the point cloud range differs a lot from Waymo's, especially along the x-axis. How should this be modified?

Any solutions?