aharley / simple_bev

A Simple Baseline for BEV Perception
MIT License

Question about translation of Radar coordinates #47

Closed: seamie6 closed 3 months ago

seamie6 commented 3 months ago

The radar points are obtained via the get_radar_data() function inside nuscenesdataset.py, and the points are translated into the ego pose relative to RADAR_FRONT:

    # Get reference pose and timestamp.
    ref_sd_token = sample_rec['data']['RADAR_FRONT']
    ref_sd_rec = nusc.get('sample_data', ref_sd_token)
    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])
    ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])
    ref_time = 1e-6 * ref_sd_rec['timestamp']

    # Homogeneous transformation matrix from global to _current_ ego car frame.
    car_from_global = transform_matrix(ref_pose_rec['translation'], Quaternion(ref_pose_rec['rotation']),
                                       inverse=True)
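For reference, transform_matrix here is nuscenes.utils.geometry_utils.transform_matrix. A minimal sketch of what it builds from the translation + quaternion pair (with inverse=True giving the global-to-ego direction), to my reading of the devkit:

    # Minimal sketch mirroring nuscenes.utils.geometry_utils.transform_matrix;
    # this is just to make the global -> ego direction explicit.
    import numpy as np
    from pyquaternion import Quaternion

    def transform_matrix_sketch(translation, rotation: Quaternion, inverse=False):
        tm = np.eye(4)
        if inverse:
            rot_inv = rotation.rotation_matrix.T              # R^-1 = R^T for rotations
            tm[:3, :3] = rot_inv
            tm[:3, 3] = rot_inv.dot(-np.array(translation))   # -R^T t
        else:
            tm[:3, :3] = rotation.rotation_matrix
            tm[:3, 3] = np.array(translation)
        return tm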

A similar thing happens in get_lidar_data(), where the points are translated into the ego pose relative to LIDAR_TOP:

    # Get reference pose and timestamp.
    ref_sd_token = sample_rec['data']['LIDAR_TOP']
    ref_sd_rec = nusc.get('sample_data', ref_sd_token)
    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])
    ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])
    ref_time = 1e-6 * ref_sd_rec['timestamp']

    # Homogeneous transformation matrix from global to _current_ ego car frame.
    car_from_global = transform_matrix(ref_pose_rec['translation'], Quaternion(ref_pose_rec['rotation']),
                                        inverse=True)
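For context, here is a sketch (not the repo's exact code) of how one sweep's points reach the reference ego frame in this devkit-style accumulation; pose_rec and cs_rec are my assumed names for that sweep's ego_pose and calibrated_sensor records:

    # sensor -> that sweep's ego -> global -> reference ego; note there is
    # no final ego -> sensor step, so the output lives in the ego frame.
    from functools import reduce
    import numpy as np
    from pyquaternion import Quaternion
    from nuscenes.utils.geometry_utils import transform_matrix

    def sweep_to_ref_ego(points, pose_rec, cs_rec, car_from_global):
        # sensor -> that sweep's ego frame
        car_from_current = transform_matrix(cs_rec['translation'],
                                            Quaternion(cs_rec['rotation']), inverse=False)
        # that sweep's ego frame -> global
        global_from_car = transform_matrix(pose_rec['translation'],
                                           Quaternion(pose_rec['rotation']), inverse=False)
        trans = reduce(np.dot, [car_from_global, global_from_car, car_from_current])
        points_h = np.vstack([points, np.ones((1, points.shape[1]))])  # (4, N) homogeneous
        return (trans @ points_h)[:3]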

Then these points are translated into CAM_FRONT coordinates in train_nuscenes.py via this transformation matrix:

    cams_T_velo = __u(utils.geom.safe_inverse(__p(velo_T_cams)))
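As I read this codebase's conventions, __p / __u pack and unpack the batch and camera dimensions (utils.basic.pack_seqdim / unpack_seqdim), and safe_inverse inverts a rigid 4x4 transform. A sketch of that inverse, assuming velo_T_cams holds rigid [R|t] camera-to-lidar matrices:

    # Sketch of inverting a batch of rigid transforms (what I understand
    # utils.geom.safe_inverse to do for [R|t] matrices).
    import torch

    def safe_inverse_sketch(mat):          # mat: (B, 4, 4), rigid [R|t]
        B = mat.shape[0]
        R = mat[:, :3, :3]
        t = mat[:, :3, 3:]
        R_inv = R.transpose(1, 2)          # rotation inverse is the transpose
        t_inv = -torch.matmul(R_inv, t)    # translation expressed in the new frame
        out = torch.eye(4, device=mat.device).unsqueeze(0).repeat(B, 1, 1)
        out[:, :3, :3] = R_inv
        out[:, :3, 3:] = t_inv
        return out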

And finally the transform is applied:

    xyz_cam0 = utils.geom.apply_4x4(cams_T_velo[:,0], xyz_velo0)
    # apply transformation of radar to camera coords (1st camera coords [:,0])
    rad_xyz_cam0 = utils.geom.apply_4x4(cams_T_velo[:,0], xyz_rad)
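apply_4x4 just homogenizes the points and multiplies. A minimal sketch of that operation, assuming xyz is (B, N, 3) and RT is (B, 4, 4):

    # Minimal sketch of a batched 4x4 transform applied to 3D points,
    # in the spirit of utils.geom.apply_4x4.
    import torch

    def apply_4x4_sketch(RT, xyz):                 # RT: (B,4,4), xyz: (B,N,3)
        B, N, _ = xyz.shape
        ones = torch.ones(B, N, 1, device=xyz.device)
        xyz1 = torch.cat([xyz, ones], dim=2)       # homogeneous coords, (B,N,4)
        xyz2 = torch.matmul(RT, xyz1.transpose(1, 2)).transpose(1, 2)
        return xyz2[:, :, :3]                      # drop w, back to (B,N,3)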

I do not understand how this is correct. How can the same transformation matrix correctly transform both the RADAR and the LIDAR points into the CAM_FRONT frame if the two point clouds are stored in different coordinate systems (the ego pose relative to each sensor)? I am sure I am missing something / have a misunderstanding; could someone point me in the right direction? Thanks

seamie6 commented 3 months ago

Resolved: this was a misunderstanding of the code on my part.