WangYueFt / detr3d

MIT License

Why do you set pred box z = z - h/2 in the head's get_bboxes? #34

Open zhanglazhuan opened 2 years ago

zhanglazhuan commented 2 years ago

In the Detr3DHead class, the predicted box tensor is shifted by z = z - h/2: https://github.com/WangYueFt/detr3d/blob/34a47673011fe13593a3e594a376668acca8bddb/projects/mmdet3d_plugin/models/dense_heads/detr3d_head.py#L440

Meanwhile, at https://github.com/open-mmlab/mmdetection3d/blob/60ce864ff76af4316fb9ae56a2a5b7741bfdd9ab/mmdet3d/datasets/nuscenes_dataset.py#L288

the ground truth is also shifted by z = z - h/2, via gt_bboxes_3d = LiDARInstance3DBoxes(gt_bboxes_3d, box_dim=gt_bboxes_3d.shape[-1], origin=(0.5, 0.5, 0.5)).convert_to(self.box_mode_3d)

But for the prediction, at https://github.com/WangYueFt/detr3d/blob/34a47673011fe13593a3e594a376668acca8bddb/projects/mmdet3d_plugin/models/dense_heads/detr3d_head.py#L441

LiDARInstance3DBoxes is built with its default origin=(0.5, 0.5, 0). So in Detr3DHead.get_bboxes, I think it should be bboxes[:, 2] = bboxes[:, 2], i.e. no shift.
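To make the origin argument concrete, here is a minimal NumPy sketch of how such an origin shift behaves. It is not the mmdetection3d code. The helper name to_bottom_center and the sample box are made up for illustration, and it assumes columns 3 to 5 hold the box sizes with column 5 being the height, as in bboxes[:, 5] above.

```python
import numpy as np

def to_bottom_center(boxes, origin=(0.5, 0.5, 0.0)):
    """Hypothetical helper: re-express box centers in bottom-center form.

    boxes:  (N, 7+) array laid out as x, y, z, size_x, size_y, h, yaw, ...
    origin: relative position of the stored (x, y, z) inside the box;
            (0.5, 0.5, 0.5) means gravity center, (0.5, 0.5, 0) bottom center.
    """
    boxes = np.array(boxes, dtype=np.float64)  # np.array copies, input is untouched
    dst = np.array([0.5, 0.5, 0.0])            # target convention: bottom center
    src = np.array(origin, dtype=np.float64)   # convention of the incoming boxes
    boxes[:, :3] += boxes[:, 3:6] * (dst - src)
    return boxes

# One box whose z is the gravity center (z = 1.0, h = 1.5).
box = np.array([[10.0, 2.0, 1.0, 4.0, 2.0, 1.5, 0.3]])

shifted = to_bottom_center(box, origin=(0.5, 0.5, 0.5))  # z becomes z - h/2
unchanged = to_bottom_center(box)                        # default origin: z is kept

print(shifted[0, 2], unchanged[0, 2])
```

With origin=(0.5, 0.5, 0.5) the z value drops by h/2, while the default origin leaves it alone. So whether the extra subtraction in get_bboxes is needed depends on which convention the predicted z is in at that point.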

And if I change bboxes[:, 2] = bboxes[:, 2] - bboxes[:, 5] * 0.5 to bboxes[:, 2] = bboxes[:, 2], the evaluation of [DETR3D, ResNet101 w/ DCN](https://github.com/WangYueFt/detr3d/blob/main/projects/configs/detr3d/detr3d_res101_gridmask.py) gives:

Evaluating bboxes of pts_bbox
mAP: 0.3460
mATE: 0.7654
mASE: 0.2678
mAOE: 0.3933
mAVE: 0.8726
mAAE: 0.2100
NDS: 0.4221

So, am I wrong or right?

a1600012888 commented 2 years ago

Hi. A good way to tell whether you are right or wrong is to visualize the boxes. The nuScenes detection evaluation cannot tell, because it computes all the results in BEV, where z does not matter.
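As a rough illustration of why a BEV metric cannot catch this, the sketch below assumes predictions are matched to ground truth by center distance on the ground plane (x and y only). The function name bev_center_distance and the numbers are hypothetical, not taken from the nuScenes devkit.

```python
import numpy as np

def bev_center_distance(pred_center, gt_center):
    """Hypothetical helper: distance between two centers using x and y only."""
    pred_xy = np.asarray(pred_center, dtype=np.float64)[:2]
    gt_xy = np.asarray(gt_center, dtype=np.float64)[:2]
    return float(np.linalg.norm(pred_xy - gt_xy))

gt = [10.0, 2.0, 0.25]
pred = [10.3, 2.4, 0.25]          # same z as the ground truth
pred_wrong_z = [10.3, 2.4, -0.5]  # z is off by h/2 = 0.75

d_ok = bev_center_distance(pred, gt)
d_wrong_z = bev_center_distance(pred_wrong_z, gt)

print(d_ok, d_wrong_z)  # identical: the z error is invisible in BEV
```

Under that assumption, a box with the wrong z is matched exactly like one with the right z, so similar scores with and without the subtraction do not show which version is correct.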

I forget some details of the preprocessing and box encoding. I only remember there is a piece of code where we represent the position with z = z - h/2. I can check it later.

Yzichen commented 2 years ago

Here the bbox_target is converted to gravity-center form, so pred_bbox is also in gravity-center form: https://github.com/WangYueFt/detr3d/blob/34a47673011fe13593a3e594a376668acca8bddb/projects/mmdet3d_plugin/models/dense_heads/dgcnn3d_head.py#L420-L423 It is therefore converted to bottom-center form later, by bboxes[:, 2] = bboxes[:, 2] - bboxes[:, 5] * 0.5
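The round trip described here can be sketched in NumPy as follows. This is only an illustration under the assumptions stated in this thread: ground-truth boxes are stored in bottom-center form, training targets use the gravity center, and column 5 is the height. The helper names and the sample box are made up.

```python
import numpy as np

def bottom_to_gravity(boxes):
    """Hypothetical helper: bottom-center z -> gravity-center z (z + h/2)."""
    out = np.array(boxes, dtype=np.float64)  # copy
    out[:, 2] = out[:, 2] + out[:, 5] * 0.5
    return out

def gravity_to_bottom(boxes):
    """Hypothetical helper: gravity-center z -> bottom-center z (z - h/2)."""
    out = np.array(boxes, dtype=np.float64)  # copy
    out[:, 2] = out[:, 2] - out[:, 5] * 0.5
    return out

# Ground truth in bottom-center form: z = 0.25, h = 1.5.
gt_bottom = np.array([[10.0, 2.0, 0.25, 4.0, 2.0, 1.5, 0.3]])

target = bottom_to_gravity(gt_bottom)     # what the head is trained to regress
perfect_pred = target.copy()              # an ideal prediction equals the target

decoded = gravity_to_bottom(perfect_pred) # with the subtraction in get_bboxes
not_decoded = perfect_pred                # with the subtraction removed

print(decoded[0, 2], not_decoded[0, 2])
```

In this sketch the decoded box lands back on the ground-truth z, while skipping the subtraction leaves even a perfect prediction h/2 too high once it is read as a bottom-center box.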