introlab / rtabmap

RTAB-Map library and standalone application
https://introlab.github.io/rtabmap
2.82k stars · 788 forks

Question about the coordinate system of rtabmap. #851

Closed cdb0y511 closed 2 years ago

cdb0y511 commented 2 years ago

This may be related to #844

Hi, @matlabbe. The coordinate system of rtabmap has confused me for a long time. I want to load my own trajectories into the standalone version, but I notice it may use a different coordinate system (the orientation or axes may differ). The result with the motion-capture format is similar to https://github.com/introlab/rtabmap/issues/844#issuecomment-1088846881 (disordered).

I think the orientation of my coordinate system is the camera frame x->right, y->down, z->forward (the standard RealSense point coordinate system). I think the camera frame is directly related to the back-projection process (2.5D to 3D; correct me if I am wrong). My back-projection is similar to rtabmap's (https://github.com/introlab/rtabmap/blob/dc58266edaf692dd645c251c1349b9f89a9b1436/corelib/src/util3d.cpp#L1100), but I notice rtabmap adds a local transform to the back-projected 3D point, i.e.

( 0  0  1  0
 -1  0  0  0
  0 -1  0  0
  0  0  0  1)

https://github.com/introlab/rtabmap/blob/dc58266edaf692dd645c251c1349b9f89a9b1436/corelib/src/util3d.cpp#L1113

This rotation shifts the coordinate system: x->-y, y->-z, z->x.

In Graph.cpp it says "remove optical rotation" and "z pointing front, x left, y down" (I think this is the part that transforms the different coordinate systems to rtabmap's):
https://github.com/introlab/rtabmap/blob/f140e99881a6fd0b6c81d5a29a13fceee28cf2f1/corelib/src/Graph.cpp#L457
https://github.com/introlab/rtabmap/blob/f140e99881a6fd0b6c81d5a29a13fceee28cf2f1/corelib/src/Graph.cpp#L481

I can understand this part:

Transform t( 0, 0, 1, 0,
            -1, 0, 0, 0,
             0,-1, 0, 0);
pose = t * pose * t.inverse();

But why another rotation?

t = Transform( 0, 0, 1, 0,
               0,-1, 0, 0,
               1, 0, 0, 0);
pose = t * pose;

In all, how do I transform my trajectories to the rtabmap coordinate system? Thanks,

cdb0y511 commented 2 years ago

I think my trajectory is in the camera frame x->right, y->down, z->forward, because I use depth and infrared to do 3D-to-2D bundle optimization (minimizing reprojection error). The back-projection process is similar to https://github.com/introlab/rtabmap/blob/dc58266edaf692dd645c251c1349b9f89a9b1436/corelib/src/util3d.cpp#L1100 . So I think rtabmap's default coordinate system is also x->right, y->down, z->forward, until the local transform is added.

cdb0y511 commented 2 years ago

I think I get part of it. I am not familiar with ROS and its coordinate systems. I think the typical coordinate system is https://www.ros.org/reps/rep-0105.html, which is z->up, x->forward, y->right. But I still get lost in the conversion. I see an example of T265 to rtabmap: https://github.com/introlab/rtabmap/blob/ddd5eb5a41d08784db4211ffec6f5eb88e4afac8/corelib/src/camera/CameraRealSense2.cpp#L176 But I get lost at the next step, another conversion: https://github.com/introlab/rtabmap/blob/ddd5eb5a41d08784db4211ffec6f5eb88e4afac8/corelib/src/camera/CameraRealSense2.cpp#L1027 It seems the odometry from the T265 is first transformed to z->up, x->forward, y->right, and the optical rotation ( 0, 0, 1, 0, -1, 0, 0, 0, 0, -1, 0, 0) still needs to be removed: https://github.com/introlab/rtabmap/blob/ddd5eb5a41d08784db4211ffec6f5eb88e4afac8/corelib/src/camera/CameraRealSense2.cpp#L1463

cdb0y511 commented 2 years ago

How to feed poses to rtabmap anyway?

matlabbe commented 2 years ago

Rtabmap assumes the odometry pose to be in the base frame of the robot, called base_link (following the ROS convention, https://www.ros.org/reps/rep-0105.html), which is x->forward, y->left and z->up.

The local transform is the transform between the base_link and camera_optical frames. With the standalone app, we assume that the position (x,y,z) of base_link is the same as camera_optical, so only an optical rotation is included in the local transform, to go from the image frame (x->right, y->down, z->forward) to the base frame (x->forward, y->left and z->up). See this REP for the _optical suffix on frames. On a real robot setup, the local transform could include another rotation and translation (the full TF transform between the base_link and camera_optical frames).

The RealSense odometry is computed in the camera_optical frame, and we transform it into the base frame. The local transform is:

 0, 0, 1, 0,
-1, 0, 0, 0,
 0,-1, 0, 0

z becomes x, -x becomes y, -y becomes z.

If P_cam is the odometry pose in the camera frame and R is the local transform (optical rotation), then the pose P_base in the base frame is: P_base = R * P_cam * R^-1

In the realsense-ros package, they apply the same transform to convert their odometry from the camera frame to the ROS base frame.

Don't look too much into the Graph.cpp file; the dataset formats there are often not standard ROS.

cdb0y511 commented 2 years ago

Thanks for your reply. I figured out what I was missing: the depth scale factor should be 4.0 for L515 input, and the format is RGB-D motion capture. I appreciate your thorough explanation.