nutonomy / nuscenes-devkit

The devkit of the nuScenes dataset.
https://www.nuScenes.org

Converting a point in camera to global reference frame #953

Closed rttus closed 11 months ago

rttus commented 12 months ago

Hello,

I have a point [-5.0, 0.0, -5.0] in the camera reference frame, which is RDF (x-right, y-down, z-front) with the ego vehicle as origin, and I am trying to map it to the global reference frame, which is RFU (x-right, y-front, z-up, mostly zero) with the bottom-left corner of the map as origin. I am using sample code like the one below:

import numpy as np
from pyquaternion import Quaternion
from nuscenes.utils.data_classes import Box

# RDF point.
box = Box([-5.0, 0.0, -5.0], [1.0, 1.0, 1.0], Quaternion(axis=[0, 1, 0], angle=0))

# First step: transform from cam to ego.
cs_rec_cam = self.nusc.get('calibrated_sensor', self.nusc.get('sample_data', sample['data']['CAM_BACK_LEFT'])['calibrated_sensor_token'])
box.translate(np.array(cs_rec_cam['translation']))
box.rotate(Quaternion(cs_rec_cam['rotation']))

# Second step: transform from ego to global.
egopose = self.nusc.get('ego_pose', self.nusc.get('sample_data', sample['data']['CAM_BACK_LEFT'])['ego_pose_token'])
box.translate(np.array(egopose['translation']))
box.rotate(Quaternion(egopose['rotation']))

After these transforms I expect the box center to be in the global reference frame, with its origin at the bottom-left corner of the map, but it doesn't seem to align with that frame. I am validating this against a similar existing pedestrian annotation located at [633.403, 1639.276, 1.131], which doesn't match the transformed box center of [1521.611, 840.40, -10.14]. The x and y axes seem switched and the z axis flipped. Is this the right way to apply the transformation? Can you please clarify?

whyekit-motional commented 12 months ago

@rttus you might want to try doing the rotation first, followed by the translation, in both the first and second steps (note that rotation and translation do not commute, i.e. the order matters)

You could use this part of the devkit's code as a reference: https://github.com/nutonomy/nuscenes-devkit/blob/a5c089133baa001d3ab3c5583a103957e4ae8375/python-sdk/nuscenes/nuscenes.py#L881-L890
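As a minimal sketch (not from the devkit itself; the rotation and translation values here are made up purely for illustration), you can see that the two orders give different results:

import numpy as np
from pyquaternion import Quaternion

point = np.array([-5.0, 0.0, -5.0])
rotation = Quaternion(axis=[0, 0, 1], angle=np.pi / 2)  # 90 degrees about z.
translation = np.array([10.0, 0.0, 0.0])

# Rotate first, then translate (the order used when going from the
# sensor frame towards the global frame).
print(rotation.rotate(point) + translation)  # approx. [10., -5., -5.]

# Translate first, then rotate: a different point entirely.
print(rotation.rotate(point + translation))  # approx. [0., 5., -5.]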

rttus commented 12 months ago

Ah, thank you so much for the reference code; that completely slipped my mind. I haven't had a chance to try it yet, but I will.

rttus commented 12 months ago

@whyekit-motional I changed the order from translation-then-rotation to rotation-then-translation. However, I still see the z axis pointing in the negative direction. The ground-truth translation values look like [637.141, 1636.252, -0.235], whereas the similar transformed point looks like [599.3276386059546, 1653.5647261002628, 1.5953615160665837].

whyekit-motional commented 11 months ago

@rttus if you can show your updated code snippet (ideally a working snippet that we can simply copy, paste, and run locally :smile:), we can see what else could be going wrong

rttus commented 11 months ago

Sorry, here is the code:

import numpy as np
from pyquaternion import Quaternion
from nuscenes.utils.data_classes import Box

# RDF point.
box = Box([-5.0, 0.0, -5.0], [1.0, 1.0, 1.0], Quaternion(axis=[0, 1, 0], angle=0))

# First step: transform from cam to ego.
cs_rec_cam = self.nusc.get('calibrated_sensor', self.nusc.get('sample_data', sample['data']['CAM_BACK_LEFT'])['calibrated_sensor_token'])
box.rotate(Quaternion(cs_rec_cam['rotation']))
box.translate(np.array(cs_rec_cam['translation']))

# Second step: transform from ego to global.
egopose = self.nusc.get('ego_pose', self.nusc.get('sample_data', sample['data']['CAM_BACK_LEFT'])['ego_pose_token'])
box.rotate(Quaternion(egopose['rotation']))
box.translate(np.array(egopose['translation']))

whyekit-motional commented 11 months ago

@rttus based on your new code snippet, you are still doing translation followed by rotation :smile: I suggest trying rotation followed by translation instead

rttus commented 11 months ago

I am sorry; I have updated the code above.

whyekit-motional commented 11 months ago

@rttus you have initialized box in the camera frame. Note that the directions of the x, y, and z axes in the camera frame are different from those in the global frame (please see here for more details)
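For reference, here is a self-contained sketch of the full camera-to-global transform (rotate before translate at each step). It assumes the v1.0-mini split has been downloaded to /data/sets/nuscenes; the box center and size are illustrative only:

import numpy as np
from pyquaternion import Quaternion
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.data_classes import Box

# Assumes the v1.0-mini split is available locally.
nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes')
sample = nusc.sample[0]
sd_rec = nusc.get('sample_data', sample['data']['CAM_BACK_LEFT'])

# Illustrative box defined in the camera (RDF) frame.
box = Box([-5.0, 0.0, -5.0], [1.0, 1.0, 1.0],
          Quaternion(axis=[0, 1, 0], angle=0))

# Camera -> ego: rotate first, then translate.
cs_rec = nusc.get('calibrated_sensor', sd_rec['calibrated_sensor_token'])
box.rotate(Quaternion(cs_rec['rotation']))
box.translate(np.array(cs_rec['translation']))

# Ego -> global: again rotate first, then translate.
pose_rec = nusc.get('ego_pose', sd_rec['ego_pose_token'])
box.rotate(Quaternion(pose_rec['rotation']))
box.translate(np.array(pose_rec['translation']))

print(box.center)  # Box center in the global frame.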