mcordts / cityscapesScripts

README and scripts for the Cityscapes Dataset
MIT License

Question about the definition of yaw, pitch and roll in 3D bbox labels #158

Closed prismformore closed 2 years ago

prismformore commented 2 years ago

Hi, may I know where I could find a clear definition of the yaw, pitch, and roll angles used in this 3D dataset? For example, the coordinate system used and the value range of these angles. I couldn't find them in the paper or the evaluation code. Thank you very much.

It would be even better to have a conversion tutorial between the quaternion labels actually used in the data and those three Euler angles, although we can directly use the tools in the evaluation code.

ncgaehle commented 2 years ago

Please check out the following files:

and see: "All GT annotations are given in the ISO coordinate system V and hence, the evaluation requires the data to be available in this coordinate system." The Euler angles used are the ones in coordinate system V.

prismformore commented 2 years ago

@ncgaehle Thank you very much for your reply. The definition of yaw-pitch-roll is important to me because I am considering the transformation between the local yaw-pitch-roll angles and the global yaw-pitch-roll angles.

I have read these files before, and since they still contain no clear definition of yaw, pitch and roll, may I confirm whether my understanding is correct:

For rotation_V obtained from box3d_annotation.get_parameters(coordinate_system=CRS_V), we can get yaw, pitch and roll by transforming the quaternion into Euler angles with rotation_V.yaw_pitch_roll. The obtained yaw is defined as the angle between the vehicle direction and the x-axis (in the x-y plane), pitch is the angle between the vehicle direction and the z-axis (in the z-x plane), and roll is the angle between the vehicle direction and the y-axis (in the z-y plane).

After we get rotation_S with box3d_annotation.get_parameters(coordinate_system=CRS_S), we can get yaw, pitch and roll by transforming the quaternion into Euler angles with rotation_S.yaw_pitch_roll.

Here (under CRS_S), the obtained yaw is defined as the angle between the vehicle direction and the z-axis (in the z-x plane), pitch is the angle between the vehicle direction and the negative y-axis (in the y-z plane), and roll is the angle between the vehicle direction and the negative x-axis (in the x-y plane).

If so, the definition of this yaw is different from the one used in MMDetection3D, right?

Your help is really appreciated!

ncgaehle commented 2 years ago

CRS_V is the coordinate frame in which all calculations (including yaw, pitch, and roll) are done for the evaluation. In CRS_V it is like this: [image] Please note that the coordinate frame in this image is different from CRS_V. It is just meant to visualize what "yaw", "pitch", and "roll" mean: "yaw" is the heading, while "pitch" is how much the nose is pointing upwards or downwards. Finally, "roll" refers to the inclination to the left or right.

If you do the calculations in another coordinate frame, the angles will of course be different.

prismformore commented 2 years ago

@ncgaehle Thank you. I understand the concept of these three angles, but I am not sure about their exact definition in CRS_V. For example, yaw in CRS_V could be the angle between the vehicle direction and the x-axis (in the x-y plane), or the angle between the vehicle direction and the negative y-axis (in the x-y plane). This should influence how we calculate the global-local yaw transformation, right?

[image]

As a reference, in MMDetection3D the camera coordinate system defines this yaw angle explicitly: [image]

ncgaehle commented 2 years ago

As described, "yaw" is the rotation around the z axis in CRS V. Yaw = 0 means a vehicle is moving in the same direction as the x axis. "roll" is the rotation around the x axis, and "pitch" around the y axis.
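As a rough illustration of this convention, here is a minimal sketch that is not taken from the evaluation code. It assumes a unit quaternion given as (w, x, y, z) and the common Z-Y-X composition order (yaw about z, then pitch about y, then roll about x); the thread does not confirm that order. The function names quat_to_yaw_pitch_roll and local_yaw are hypothetical, and the "local" yaw shown (global yaw minus the azimuth of the viewing ray in the x-y plane) is only one plausible definition for the global-local transformation asked about above.

```python
import math


def quat_to_yaw_pitch_roll(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to (yaw, pitch, roll) in radians.

    Assumption: R = Rz(yaw) * Ry(pitch) * Rx(roll), i.e. yaw is the rotation
    about z, pitch about y, and roll about x, as described in the thread.
    The composition order itself is an assumption of this sketch.
    """
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    return yaw, pitch, roll


def local_yaw(global_yaw, obj_x, obj_y):
    """Hypothetical 'local' yaw: global yaw minus the azimuth of the ray
    from the origin to the object centre (obj_x, obj_y) in the x-y plane.

    Assumption: yaw = 0 points along +x and yaw is measured about z, so the
    ray azimuth is atan2(obj_y, obj_x). The result is wrapped to [-pi, pi).
    """
    diff = global_yaw - math.atan2(obj_y, obj_x)
    return (diff + math.pi) % (2.0 * math.pi) - math.pi


if __name__ == "__main__":
    # A pure 90-degree rotation about z: w = cos(45 deg), z = sin(45 deg).
    h = math.sqrt(0.5)
    print(quat_to_yaw_pitch_roll(h, 0.0, 0.0, h))
    # An object on the diagonal whose heading points straight along the ray.
    print(local_yaw(math.pi / 4, 10.0, 10.0))
```

With yaw = 0 along the x axis, a quaternion describing a pure 90-degree rotation about z gives a yaw of about pi/2 with zero pitch and roll. If the angles are instead computed in another frame such as CRS_S, neither the axis assignment nor the azimuth formula above applies unchanged.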