exiawsh / StreamPETR

[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

[Question] How is gt_bboxes_3d derived from sample_annotation.json? #188

Open LiuJieShane opened 7 months ago

LiuJieShane commented 7 months ago

I am trying to understand the ground-truth boxes, so I saved the gt_bboxes_3d of one sample loaded by the data loader and drew the boxes onto the corresponding camera images.

Where I saved gt_bboxes_3d: petr3d.py#L223

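How I drew them (my understanding is that gt_bboxes_3d are 3D lidar boxes in the sensor coordinate system, so they have to be transformed with the calibration data before being drawn in the camera view):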

box_list = []
for box in boxes_mine:
    # Move from sensor coordinate to world coordinate
    box.rotate(Quaternion(lidar_rotation))
    box.translate(np.array(lidar_translation))

    box.rotate(Quaternion(pose_record['rotation']))
    box.translate(np.array(pose_record['translation']))

    if use_flat_vehicle_coordinates:
        # Move box to ego vehicle coord system parallel to world z plane.
        yaw = Quaternion(pose_record['rotation']).yaw_pitch_roll[0]
        box.translate(-np.array(pose_record['translation']))
        box.rotate(Quaternion(scalar=np.cos(yaw / 2), vector=[0, 0, np.sin(yaw / 2)]).inverse)
    else:
        # Move box to ego vehicle coord system.
        box.translate(-np.array(pose_record['translation']))
        box.rotate(Quaternion(pose_record['rotation']).inverse)

        #  Move box to sensor coord system.
        box.translate(-np.array(cs_record['translation']))
        box.rotate(Quaternion(cs_record['rotation']).inverse)

    if sensor_record['modality'] == 'camera' and not \
            box_in_image(box, cam_intrinsic, imsize, vis_level=box_vis_level):
        continue

However, the drawn boxes appear to be misaligned: b993550e60054741983f8052ba97b0b0_v13

To work out why they are misaligned, I checked the data and found two problems:

exiawsh commented 6 months ago
  1. [1299.232, 918.868,1.568], "size": [2.908,10.909,4.454], "rotation": [0.9473110943049808,0.0,0.0,-0.3203149865471482] These annotations are in the nuScenes global coordinate system and need to be converted to the mmdet3d lidar coordinate system.
  2. The values keep changing because a BEV-space augmentation is applied. If you remove it they should stop changing, but the mAOE metric will degrade. https://github.com/exiawsh/StreamPETR/blob/2315cf9f077817ec7089c87094ba8a63f76c2acf/projects/configs/StreamPETR/stream_petr_r50_flash_704_bs2_seq_24e.py#L173
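
The conversion in point 1 could look roughly like the sketch below. It is not the StreamPETR or mmdet3d code. It assumes the nuScenes record layout (`translation` plus a `rotation` quaternion in w, x, y, z order) for the lidar sample's `ego_pose` and `calibrated_sensor` records, and the helper names `quat_to_rot` and `global_to_lidar` are made up for illustration. The final step of reordering the size and adjusting the yaw to the convention that mmdet3d stores depends on the mmdet3d version, so it is left out here.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a quaternion given in (w, x, y, z) order."""
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])

def global_to_lidar(center, rotation, ego_pose, lidar_calib):
    """Move a box center and orientation from the global frame to the lidar frame.

    ego_pose and lidar_calib are dicts with 'translation' and 'rotation' keys
    (assumed nuScenes layout). Both must belong to the lidar sample_data.
    Returns the center in the lidar frame and the yaw about the lidar z axis.
    """
    c = np.asarray(center, dtype=float)
    R_box = quat_to_rot(rotation)
    # Undo ego -> global first, then lidar -> ego.
    for rec in (ego_pose, lidar_calib):
        R = quat_to_rot(rec['rotation'])
        t = np.asarray(rec['translation'], dtype=float)
        c = R.T @ (c - t)
        R_box = R.T @ R_box
    # Heading of the box x axis, measured in the lidar x-y plane.
    yaw = np.arctan2(R_box[1, 0], R_box[0, 0])
    return c, yaw

# The annotation quoted above, with hypothetical pose values for illustration.
ego_pose = {'translation': [1290.0, 910.0, 0.0], 'rotation': [1.0, 0.0, 0.0, 0.0]}
lidar_calib = {'translation': [0.9, 0.0, 1.8], 'rotation': [0.7071068, 0.0, 0.0, -0.7071068]}
center_lidar, yaw_lidar = global_to_lidar(
    [1299.232, 918.868, 1.568],
    [0.9473110943049808, 0.0, 0.0, -0.3203149865471482],
    ego_pose, lidar_calib)
```

Boxes produced this way are only comparable with the saved gt_bboxes_3d if the BEV augmentation from point 2 is switched off, since that augmentation changes the boxes after the conversion.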
LiuJieShane commented 6 months ago
  1. [1299.232, 918.868,1.568], "size": [2.908,10.909,4.454], "rotation": [0.9473110943049808,0.0,0.0,-0.3203149865471482] These annotations are in the nuScenes global coordinate system and need to be converted to the mmdet3d lidar coordinate system.

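@exiawsh What exactly is the conversion method? I have already converted gt_bboxes_3d from the lidar coordinate system back to the ego coordinate system and then back to the global coordinate system, but the boxes are still misaligned in the BEV view (first row of the figure). When drawing onto the camera images I also converted to the global coordinate system first and then to the camera coordinate system, and the boxes are still misaligned (last three rows of the figure).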
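
For reference, the lidar → ego → global → ego → camera chain described here can be written as one matrix product, as in the sketch below. It is only an illustration, not code from StreamPETR or the nuScenes devkit: the function names and the 4x4 matrix arguments are assumptions, and each matrix maps the first frame in its name to the second. One detail that may matter for the misalignment is that the camera image and the lidar sweep have different timestamps, so the sketch takes two separate ego poses. Note also that, going by the reply above, boxes saved after the training pipeline have already been changed by the BEV augmentation, so they would not line up with the images unless that augmentation is disabled or undone.

```python
import numpy as np

def make_transform(rotation_matrix, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = rotation_matrix
    T[:3, 3] = translation
    return T

def lidar_to_image(points_lidar, lidar2ego, ego2global_lidar,
                   ego2global_cam, cam2ego, intrinsic):
    """Project (N, 3) lidar-frame points (e.g. box corners) to pixel coordinates.

    ego2global_lidar is the ego pose at the lidar timestamp and ego2global_cam
    the ego pose at the camera timestamp. intrinsic is the 3x3 camera matrix.
    Returns (N, 2) pixel coordinates (NaN for points behind the camera) and a
    boolean mask of the points that lie in front of the camera.
    """
    lidar2cam = (np.linalg.inv(cam2ego) @ np.linalg.inv(ego2global_cam)
                 @ ego2global_lidar @ lidar2ego)
    pts = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (lidar2cam @ pts.T).T[:, :3]
    in_front = pts_cam[:, 2] > 1e-3
    # Avoid dividing by zero or negative depth for points behind the camera.
    depth = np.where(in_front, pts_cam[:, 2], np.nan)
    uvw = (intrinsic @ pts_cam.T).T
    uv = uvw[:, :2] / depth[:, None]
    return uv, in_front
```

If the boxes still do not line up with an un-augmented sample, comparing `lidar2cam` against the projection matrices the data loader itself provides would be a reasonable next check.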