nutonomy / nuscenes-devkit

The devkit of the nuScenes dataset.
https://www.nuScenes.org

How to get raw transformation matrix between different sensor modalities? #1051

Closed seamie6 closed 6 months ago

seamie6 commented 6 months ago

Transforming between different sensor modalities across frames is doable. For example, going from CAM_FRONT frame 0 to RADAR_FRONT frame 1 involves the transformations:

CAM_FRONT F0 -> GLOBAL COORDS -> RADAR_FRONT F1

It is possible to do this using the ego_pose records, as done in the get_radar_data() function from nuscenesdataset.py. This snippet shows the car_from_global transformation matrix, which describes the transformation from GLOBAL coordinates into the ego frame at the RADAR_FRONT timestamp:


    from pyquaternion import Quaternion
    from nuscenes.utils.geometry_utils import transform_matrix

    # Get reference pose and timestamp.
    ref_sd_token = sample_rec['data']['RADAR_FRONT']
    ref_sd_rec = nusc.get('sample_data', ref_sd_token)
    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])
    ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])
    ref_time = 1e-6 * ref_sd_rec['timestamp']

    # Homogeneous transformation matrix from global to _current_ ego car frame.
    car_from_global = transform_matrix(ref_pose_rec['translation'],
                                       Quaternion(ref_pose_rec['rotation']), inverse=True)
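
For reference, this logic chains all the way into the full CAM_FRONT F0 -> GLOBAL -> RADAR_FRONT F1 transformation. A minimal sketch, where sample_rec_0 and sample_rec_1 are placeholder sample records for frames 0 and 1:

    from functools import reduce

    import numpy as np
    from pyquaternion import Quaternion
    from nuscenes.utils.geometry_utils import transform_matrix

    # sample_rec_0 / sample_rec_1: placeholder sample records for frames 0 and 1.
    cam_sd_rec = nusc.get('sample_data', sample_rec_0['data']['CAM_FRONT'])
    rad_sd_rec = nusc.get('sample_data', sample_rec_1['data']['RADAR_FRONT'])
    cam_cs_rec = nusc.get('calibrated_sensor', cam_sd_rec['calibrated_sensor_token'])
    rad_cs_rec = nusc.get('calibrated_sensor', rad_sd_rec['calibrated_sensor_token'])
    cam_pose_rec = nusc.get('ego_pose', cam_sd_rec['ego_pose_token'])
    rad_pose_rec = nusc.get('ego_pose', rad_sd_rec['ego_pose_token'])

    # CAM_FRONT (frame 0) -> ego at the camera timestamp -> GLOBAL.
    car_from_cam = transform_matrix(cam_cs_rec['translation'],
                                    Quaternion(cam_cs_rec['rotation']), inverse=False)
    global_from_car = transform_matrix(cam_pose_rec['translation'],
                                       Quaternion(cam_pose_rec['rotation']), inverse=False)

    # GLOBAL -> ego at the radar timestamp -> RADAR_FRONT (frame 1).
    car_from_global = transform_matrix(rad_pose_rec['translation'],
                                       Quaternion(rad_pose_rec['rotation']), inverse=True)
    radar_from_car = transform_matrix(rad_cs_rec['translation'],
                                      Quaternion(rad_cs_rec['rotation']), inverse=True)

    radar1_from_cam0 = reduce(np.dot, [radar_from_car, car_from_global,
                                       global_from_car, car_from_cam])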

What I am looking for is a 'raw' transformation matrix that describes the transformation between two sensor modalities while the vehicle is stationary (CAM_FRONT to RADAR_FRONT, for example). This is equivalent to the case where the two modalities' timestamps are EXACTLY the same, since the ego vehicle would then have the same global coordinates for both modalities. One case where I need this is when ref_sd_token = sample_rec['data']['RADAR_FRONT'] is not available.

As I stated, this raw transformation matrix can be extracted when the timestamp difference between CAM_FRONT and RADAR_FRONT is 0. However, I could not find a case where this occurs: radar and camera are asynchronous, so an exact match would be rare. The closest I could get is 4 microseconds, but I would like the exact raw transformation matrix, if that is possible.

Thank you.

seamie6 commented 6 months ago

I think I may have gotten it:

from functools import reduce

import numpy as np
from pyquaternion import Quaternion
from nuscenes.utils.geometry_utils import transform_matrix

# Specify which scene and sample you want to work with.
scene = nusc.scene[6]
sample_token = scene['first_sample_token']  # token of this scene's first sample
my_sample = nusc.get('sample', sample_token)  # get first sample
# my_sample = nusc.get('sample', my_sample['next'])

rad_tok = my_sample['data']['RADAR_FRONT']
rad_data = nusc.get('sample_data', rad_tok)
rad_sensor = nusc.get('calibrated_sensor', rad_data['calibrated_sensor_token'])

cam_tok = my_sample['data']['CAM_FRONT']
cam_data = nusc.get('sample_data', cam_tok)
cam_sensor = nusc.get('calibrated_sensor', cam_data['calibrated_sensor_token'])

# RADAR_FRONT -> ego frame, then ego frame -> CAM_FRONT.
car_from_radar = transform_matrix(rad_sensor['translation'],
                                  Quaternion(rad_sensor['rotation']), inverse=False)
cam_from_car = transform_matrix(cam_sensor['translation'],
                                Quaternion(cam_sensor['rotation']), inverse=True)

cam_from_radar = reduce(np.dot, [cam_from_car, car_from_radar])

cam_from_radar is constant for all key frames. Does this correctly describe the transformation from radar to camera?

whyekit-motional commented 6 months ago

@seamie6 your code looks reasonable to me - you can also try rendering the transformed radar points with the corresponding camera image to check
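
One way such a check could look, as a sketch (it reuses cam_from_radar, rad_data, cam_data, and cam_sensor from the snippet above; this is not a devkit helper, and the two keyframes are only approximately time-aligned):

    import os.path as osp

    import matplotlib.pyplot as plt
    import numpy as np
    from PIL import Image
    from nuscenes.utils.data_classes import RadarPointCloud
    from nuscenes.utils.geometry_utils import view_points

    # Assumes cam_from_radar, rad_data, cam_data, cam_sensor from the earlier snippet.
    # Load the radar points (in the RADAR_FRONT sensor frame) and the camera image.
    pc = RadarPointCloud.from_file(osp.join(nusc.dataroot, rad_data['filename']))
    im = Image.open(osp.join(nusc.dataroot, cam_data['filename']))

    # Move the points into the CAM_FRONT frame with the matrix computed above.
    pc.transform(cam_from_radar)

    # Project onto the image plane with the camera intrinsics and overlay.
    depths = pc.points[2, :]
    points = view_points(pc.points[:3, :], np.array(cam_sensor['camera_intrinsic']),
                         normalize=True)
    mask = depths > 1.0  # keep only points in front of the camera
    plt.imshow(im)
    plt.scatter(points[0, mask], points[1, mask], c=depths[mask], s=5)
    plt.show()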

seamie6 commented 6 months ago

I think I am making unnecessary work for myself. Could you verify that this is correct:

    # Get reference pose and timestamp.
    ref_sd_token = sample_rec['data']['RADAR_FRONT']
    ref_sd_rec = nusc.get('sample_data', ref_sd_token)
    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])
    ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])

This code gets the ego pose at the timestamp at which RADAR_FRONT was captured. It represents the transformation from the ego-vehicle coordinates at that timestamp into GLOBAL coordinates. By the same logic, if RADAR_FRONT were replaced with LIDAR_TOP, this would correspond to the ego pose at the timestamp at which LIDAR_TOP was captured. If that is correct, there is no need for me to get this 'raw' transformation matrix: when transforming radar coordinates from one frame to another, I only need to transform the old RADAR coordinates into the ego pose of the new frame, say for CAM_FRONT. There is no need for a transformation between 'sensor coordinates', as the RADAR coordinates are given in the ego-vehicle frame of the timestamp at which RADAR_FRONT was captured.

Also, a note on the code I submitted for cam_from_radar: it is not constant across all scenes, because two different ego vehicles were used to capture nuScenes, so the positioning of the sensors differs between the two vehicles. However, it is constant for all frames within a given scene.

whyekit-motional commented 6 months ago

@seamie6 yes, the code which you showed here gets the pose of the ego wrt the global frame

This line gets the pose of the ego wrt the global frame:

    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])

And this line gets the pose of a particular sensor wrt the ego frame:

    ref_cs_rec = nusc.get('calibrated_sensor', ref_sd_rec['calibrated_sensor_token'])

With these two pieces of information, you have everything required to construct a transformation matrix between any two sensors you like
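
For illustration, those two records could be composed into a general sensor-to-sensor matrix along these lines (a hypothetical helper, not part of the devkit):

    from functools import reduce

    import numpy as np
    from pyquaternion import Quaternion
    from nuscenes.utils.geometry_utils import transform_matrix

    def sensor_from_sensor(nusc, src_sd_token: str, dst_sd_token: str) -> np.ndarray:
        """Hypothetical helper: 4x4 matrix mapping points from the sensor frame of
        src_sd_token to the sensor frame of dst_sd_token, going via the ego pose
        recorded at each sample_data's timestamp."""
        src_sd = nusc.get('sample_data', src_sd_token)
        dst_sd = nusc.get('sample_data', dst_sd_token)
        src_cs = nusc.get('calibrated_sensor', src_sd['calibrated_sensor_token'])
        dst_cs = nusc.get('calibrated_sensor', dst_sd['calibrated_sensor_token'])
        src_pose = nusc.get('ego_pose', src_sd['ego_pose_token'])
        dst_pose = nusc.get('ego_pose', dst_sd['ego_pose_token'])

        # Source sensor -> ego -> global ...
        car_from_src = transform_matrix(src_cs['translation'],
                                        Quaternion(src_cs['rotation']), inverse=False)
        global_from_car = transform_matrix(src_pose['translation'],
                                           Quaternion(src_pose['rotation']), inverse=False)
        # ... then global -> ego -> destination sensor.
        car_from_global = transform_matrix(dst_pose['translation'],
                                           Quaternion(dst_pose['rotation']), inverse=True)
        dst_from_car = transform_matrix(dst_cs['translation'],
                                        Quaternion(dst_cs['rotation']), inverse=True)

        return reduce(np.dot, [dst_from_car, car_from_global,
                               global_from_car, car_from_src])

When the two sample_data timestamps are identical, the two ego-pose factors cancel exactly and this reduces to the 'raw' cam_from_radar matrix from earlier in the thread.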

seamie6 commented 6 months ago

My focus on transformation matrices was driven by the incorrect assumption that this code:

    ref_sd_token = sample_rec['data']['RADAR_FRONT']
    ref_sd_rec = nusc.get('sample_data', ref_sd_token)
    ref_pose_rec = nusc.get('ego_pose', ref_sd_rec['ego_pose_token'])

has something to do with sensor-modality coordinates, which it does not. This makes what I was trying to achieve much simpler than I thought, although transformation between sensor-modality coordinate systems is still possible, as I showed with my cam_from_radar code.

Thank you for clearing up my confusion, much appreciated!