introlab / rtabmap_ros

RTAB-Map's ROS package.
http://wiki.ros.org/rtabmap_ros
BSD 3-Clause "New" or "Revised" License

How do I get the frame under camera_link? #993

Open ydzat opened 1 year ago

ydzat commented 1 year ago

I asked this question over at realsense-ros earlier. It was finally confirmed that my problem was not with the camera driver.

My question: I use a RealSense D455 camera to publish topics, receive them with rtabmap, and save the data to a .bag with a Python script. I then reconstruct a point cloud (.ply) from the depth and tf data, import the .ply into Blender, create a virtual camera in Blender, simulate the motion of the real camera based on the tf information extracted from the .bag, and save the rendered images.

In theory, each rendered image should match the corresponding real RGB image (e.g. an object visible in an RGB image should also appear in the corresponding rendered image, at the same position).

However, the rendered images currently do not match the RGB images.

[Screenshot: the rendered image compared with the real RGB image]

As you can see in the image, the bottles, for example, are not at the same height. There is another image at the link above; in some cases there is some offset in the position of some of the objects.

When I run rtabmap I can see an axis in the 3D Map panel. After comparing, I found that this axis is probably camera_link, and its viewpoint is noticeably lower than the RGB image displayed in the window next to it. So my guess is that the tf recorded by rtabmap is for camera_link and not for camera_color_optical_frame below camera_link (the tf tree is shown below), and that this might cause the tf of camera_color_optical_frame recorded in the .bag to be overwritten by the camera_link one.

This is the tf tree of the .bag: [screenshot of the tf tree]

I don't know much about rtabmap, and I haven't found a way to change the frame it subscribes to (e.g. to camera_color_optical_frame, which is what I want).

Here is my launch file:

<?xml version="1.0"?>
<launch>
    <include file="$(find rtabmap_ros)/launch/rtabmap.launch">
        <arg name="rtabmap_args"        value= "--delete_db_on_start --Rtabmap/DetectionRate 2  RGBD/LinearUpdate 0.01  RGBD/AngularUpdate 0.01" />
        <arg name="use_sim_time"        value="false"/>
        <arg name="rtabmapviz"          value="true" />
        <arg name="rviz"                value="false" />

        <arg name="depth_topic"         value="/camera/aligned_depth_to_color/image_raw" />
        <arg name="rgb_topic"           value="/camera/color/image_raw" />
        <arg name="camera_info_topic"   value="/camera/color/camera_info" />

        <arg name="depth_camera_info_topic" value="/camera/aligned_depth_to_color/camera_info" />

        <arg name="approx_sync"         value="true" />
        <arg name="wait_imu_to_init"    value="true"  />
        <arg name="imu_topic"           value="/rtabmap/imu" />

        <arg name="frame_id"            value="camera_link"/>
        <arg name="map_frame_id"        value="map"/>
    </include>
</launch>

I'm guessing that the offset in the image is a tf problem. How do I get the correct frame and its tf? And is there any way to make the view height in the 3D Map displayed in rtabmap consistent with the RGB image?

matlabbe commented 1 year ago

In your tf tree, the base frame is camera_link. You can check if there is an offset between camera_link and camera_color_optical_frame with:

rosrun tf tf_echo camera_link camera_color_optical_frame

Normally there should be at least a 90 deg rotation in yaw and roll.
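
For a programmatic equivalent, a minimal sketch using tf2_ros (assuming both frames are currently being broadcast):

import rospy
import tf2_ros

rospy.init_node('check_offset')
tf_buffer = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(tf_buffer)
# Latest available transform between the two frames; waits up to 5 s.
t = tf_buffer.lookup_transform('camera_link', 'camera_color_optical_frame',
                               rospy.Time(0), rospy.Duration(5.0))
print(t.transform.translation)
print(t.transform.rotation)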

map -> camera_color_optical_frame looks wrong to me, if frame_id was indeed camera_link for that test.

Which camera info are you using to generate the rendered image? If you use TF in Blender, the pose map -> camera_color_optical_frame should be the pose of the optical frame of the color camera in the map (the RGB camera pose). At exactly the same stamp and with the same calibration data, normally the rendered frame would match the real one.
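
If the Blender camera intrinsics do not match camera_info, the render will not line up even with a perfect pose. A minimal sketch of setting them from the pinhole model (assuming fx ≈ fy and no distortion; the numbers are placeholders standing in for your recorded /camera/color/camera_info):

import bpy

# Placeholder intrinsics; use the K matrix and resolution from camera_info.
fx = 615.0
cx, cy = 320.0, 240.0
w, h = 640, 480

cam = bpy.context.scene.camera.data       # the active scene camera
cam.sensor_fit = 'HORIZONTAL'
cam.sensor_width = 36.0                   # arbitrary; only the fx/width ratio matters
cam.lens = fx * cam.sensor_width / w      # focal length in mm
# Principal point expressed as lens shift; signs may need flipping for your setup.
cam.shift_x = (w / 2.0 - cx) / w
cam.shift_y = (cy - h / 2.0) / w

bpy.context.scene.render.resolution_x = w
bpy.context.scene.render.resolution_y = h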

ydzat commented 1 year ago

map -> camera_color_optical_frame looks wrong to me

frame_id is indeed camera_link. If it is wrong, what should I use instead? Or what should I do?

Which camera info are you using to generate the rendered image?

camera_info_topic, the topic is /camera/color/camera_info

... normally the rendered frame would match the real one

That's what I thought, but there is some degree of offset, and I haven't been able to fix it so far...

matlabbe commented 1 year ago

How was map -> camera_color_optical_frame recorded? With frame_id set to camera_link, you should instead see a TF tree like this:

map
 |
 -> odom
    |
    -> camera_link
        |
        -> camera_aligned_depth_to_color_frame
        |   |
        |   -> camera_color_optical_frame
        |
        -> other frames...

Using camera_link is fine.

Returning to your original question, which frames in tf are used when you say

simulate the motion of the real camera based on the tf information extracted from the .bag

? Make sure you use map to camera_color_optical_frame. Another thing to check: do you observe exactly the same offset for all rendered frames, or does it vary? If it varies, it could be related to errors/drift in the map that make objects appear at slightly different positions (one way to confirm this is to not move the camera at all, so the expected motion is identity, and check whether there is still an offset).

ydzat commented 1 year ago

With frame_id set to camera_link, the TF tree looks slightly different from what you drew; it looks like this:

camera_link
        |
        -> camera_aligned_depth_to_color_frame
        |   |
        |   -> camera_color_optical_frame
        |
        -> other frames...

There's no map or odom in here, as the very first image shows. But if I change <param name="_publish_tf" type="bool" value="false" /> to <param name="_publish_tf" type="bool" value="true" /> in the camera driver's launch file, then the TF tree would be:

map
   |
   -> odom  
        |
        -> camera_imu_optical_frame
        |   
        -> camera_link
                  |
                  -> camera_aligned_depth_to_color_frame
                  |    |
                  |    -> camera_color_optical_frame
                  |
                  -> other frames...

Other thing to check, do you observe exactly the same offset for all rendered frames, or does it vary?

It does vary. So my first guess was that it was a distortion problem, but when I corrected the distortion of one image A according to camera_info, image A was fine; when I corrected another image B with the same parameters (say, an image acquired from another angle), the offset was even worse.
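
For reference, the per-image undistortion step described here would typically look like this (a sketch assuming OpenCV and the plumb_bob model; K and D are placeholders standing in for the recorded sensor_msgs/CameraInfo values):

import cv2
import numpy as np

# Placeholder values; in practice K (3x3, row-major) and D come from camera_info.
K = np.array([[615.0, 0.0, 320.0],
              [0.0, 615.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.0, 0.0, 0.0, 0.0, 0.0])
image = cv2.imread('frame.png')
undistorted = cv2.undistort(image, K, D)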


So I'm confused: is there a problem with the position of camera_color_optical_frame, or is the transform from map to camera_color_optical_frame not resolved correctly when publish_tf is false?

Here is my launch file, adapted from realsense2_camera's opensource_tracking.launch:

<launch>
    <arg name="offline"          default="false"/>
    <include unless="$(arg offline)" 
        file="$(find realsense2_camera)/launch/rs_camera.launch">
        <arg name="align_depth" value="true"/>
        <arg name="linear_accel_cov" value="1.0"/>
        <arg name="unite_imu_method" value="linear_interpolation"/>
        <arg name="clip_distance" value="4.0"/>

        <arg name="enable_gyro" value="true"/>
        <arg name="enable_accel" value="true"/>
        <arg name="enable_sync" value="true"/>

        <arg name="depth_width" value="640"/> 
        <arg name="depth_height" value="480"/>
        <arg name="depth_fps" value="30"/>

        <arg name="color_width" value="640"/>
        <arg name="color_height" value="480"/>
        <arg name="color_fps" value="30"/> 
    </include>

    <node pkg="imu_filter_madgwick" type="imu_filter_node" name="ImuFilter">
        <param name="use_mag" type="bool" value="false" />
        <param name="_publish_tf" type="bool" value="true" />
        <param name="_world_frame" type="string" value="enu" />
        <remap from="/imu/data_raw" to="/camera/imu"/>
        <!-- <remap from="/imu/data" to="/rtabmap/imu" />     is this needed?--> 
    </node>

    <include file="$(find rtabmap_ros)/launch/rtabmap.launch">

        <arg name="args"        value= "--delete_db_on_start --Rtabmap/DetectionRate 2  RGBD/LinearUpdate 0.01  RGBD/AngularUpdate 0.01" />

        <arg name="rgb_topic" value="/camera/color/image_raw"/>
        <arg name="depth_topic" value="/camera/aligned_depth_to_color/image_raw"/>
        <arg name="camera_info_topic" value="/camera/color/camera_info"/>
        <arg name="depth_camera_info_topic" value="/camera/depth/camera_info"/>
        <arg name="rtabmapviz" value="false"/>
        <arg name="rviz" value="true"/>

        <arg name="use_sim_time"        value="false"/>

    </include>

    <include file="$(find robot_localization)/launch/ukf_template.launch"/>
    <param name="/ukf_se/frequency" value="300"/>
    <param name="/ukf_se/base_link_frame" value="camera_link"/>
    <param name="/ukf_se/odom0" value="rtabmap/odom"/>
    <rosparam param="/ukf_se/odom0_config">[true,true,true,
                                            true,true,true,
                                            true,true,true,
                                            true,true,true,
                                            true,true,true]
    </rosparam>
    <param name="/ukf_se/odom0_relative" value="true"/>
    <param name="/ukf_se/odom0_pose_rejection_threshold" value="10000000"/>
    <param name="/ukf_se/odom0_twist_rejection_threshold" value="10000000"/>

    <param name="/ukf_se/imu0" value="/imu/data"/>
    <rosparam param="/ukf_se/imu0_config">[false, false, false,
                                           true,  true,  true,
                                           true,  true,  true,
                                           true,  true,  true,
                                           true,  true,  true]
    </rosparam>
    <param name="/ukf_se/imu0_differential" value="true"/>
    <param name="/ukf_se/imu0_relative" value="false"/>
    <param name="/ukf_se/use_control" value="false"/>
    <!-- <param name="/ukf_se/odom0_config" value="{true,true,true,}"/> -->
</launch>
matlabbe commented 1 year ago

Could you remove robot_localization from the equation, to limit possible confusion with the published TF frames? robot_localization would also publish the odom or map frame at the same time as rtabmap, which will create TF issues afterwards.

For the IMU filter, publish_tf (not _publish_tf) should be false, otherwise an odom frame will be published to the imu frame, which will mess up the TF tree.

The launch file could look like this:

<launch>
    <include file="$(find realsense2_camera)/launch/rs_camera.launch">
        <arg name="align_depth" value="true"/>
        <arg name="linear_accel_cov" value="1.0"/>
        <arg name="unite_imu_method" value="linear_interpolation"/>
        <arg name="clip_distance" value="0.0"/>

        <arg name="enable_gyro" value="true"/>
        <arg name="enable_accel" value="true"/>
        <arg name="enable_sync" value="true"/>

        <arg name="depth_width" value="640"/> 
        <arg name="depth_height" value="480"/>
        <arg name="depth_fps" value="30"/>

        <arg name="color_width" value="640"/>
        <arg name="color_height" value="480"/>
        <arg name="color_fps" value="30"/> 
    </include>

    <node pkg="imu_filter_madgwick" type="imu_filter_node" name="ImuFilter">
        <param name="use_mag" type="bool" value="false" />
        <param name="publish_tf" type="bool" value="false" />
        <param name="world_frame" type="string" value="enu" />
        <remap from="/imu/data_raw" to="/camera/imu"/>
    </node>

    <include file="$(find rtabmap_ros)/launch/rtabmap.launch">

        <arg name="args"        value= "--delete_db_on_start --Rtabmap/DetectionRate 2  RGBD/LinearUpdate 0.01  RGBD/AngularUpdate 0.01" />

        <arg name="rgb_topic" value="/camera/color/image_raw"/>
        <arg name="depth_topic" value="/camera/aligned_depth_to_color/image_raw"/>
        <arg name="camera_info_topic" value="/camera/color/camera_info"/>
        <arg name="depth_camera_info_topic" value="/camera/depth/camera_info"/>
        <arg name="rtabmapviz" value="false"/>
        <arg name="rviz" value="true"/>
        <arg name="approx_sync" value="false"/>
        <arg name="frame_id" value="camera_link"/>
        <arg name="wait_imu_to_init" value="true"/>
        <arg name="imu_topic" value="/imu/data"/>

    </include>
</launch>
ydzat commented 1 year ago

That makes sense, but the results haven't changed...


matlabbe commented 1 year ago

In Blender, in which coordinates are you working? The TF pose follows the ROS convention (x forward, y left, z up), while Blender may use OpenGL-style coordinate frames (x right, y down, z forward). Are the rotations correctly converted?
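
A minimal sketch of one way to handle that conversion with Blender's mathutils, rotating 180 degrees about the camera x axis to go from the optical convention (y down, z forward) to Blender's camera (y up, looking down -z); the pose values are placeholders, and the signs may need adjusting depending on how the pose was exported:

import math
import bpy
from mathutils import Quaternion, Vector

# Placeholder pose; substitute the map -> camera_color_optical_frame values.
x, y, z = 0.0, 0.0, 0.0
qx, qy, qz, qw = 0.0, 0.0, 0.0, 1.0

camera = bpy.context.scene.camera
q_map_optical = Quaternion((qw, qx, qy, qz))   # mathutils order: w, x, y, z
flip = Quaternion((1.0, 0.0, 0.0), math.pi)    # 180 deg about the camera x axis

camera.location = Vector((x, y, z))
camera.rotation_mode = 'QUATERNION'
camera.rotation_quaternion = q_map_optical @ flip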

EDIT: can you share the database and your generated mesh (ply) you are using in blender?

ydzat commented 1 year ago

In Blender, in which coordinates are you working? The TF pose follows the ROS convention (x forward, y left, z up), while Blender may use OpenGL-style coordinate frames (x right, y down, z forward). Are the rotations correctly converted?

It has been converted correctly.

ydzat commented 1 year ago

EDIT: can you share the database and your generated mesh (ply) you are using in blender?

Sorry, I can't share that data; I can only share the results and some of the code.

Regarding the extracted tf:

    for topic, msg, t in bag.read_messages(topics=['/tf', '/rec/rgb']):
        if topic == '/rec/rgb':
            image_time_name.append((t, msg.header.seq))
        if topic == '/tf':
            for transform in msg.transforms:
                # Keep every transform whose child is the color optical frame,
                # regardless of its parent frame.
                if transform.child_frame_id == 'camera_color_optical_frame':
                    pose = transform.transform

                    translation = (pose.translation.x, pose.translation.y, pose.translation.z)
                    rotation = [pose.rotation.x, pose.rotation.y, pose.rotation.z, pose.rotation.w]

                    camera_poses.append((t, [translation, rotation]))

Use of tf in Blender:

        x, y, z, qw, qz, qy, qx = output_list

        camera.location = Vector((x, y, z))
        camera.rotation_mode = 'QUATERNION'
        camera.rotation_quaternion = Quaternion((qw, -qx, -qy, qz))
matlabbe commented 1 year ago

You cannot extract the pose like this. Given the tf tree you have shown here, the transform with camera_color_optical_frame as child will always be the static camera_aligned_depth_to_color_frame -> camera_color_optical_frame transform, so that "pose" will always be the same.

Getting the actual pose of the camera in the map frame at the time of your image is more complicated than that, as you may need some interpolation (it is unlikely that there are tf messages along the whole tree with exactly the same stamp as the image frame). I think the easiest way would be: while you record the bag, run a node subscribing to the image topic; in the callback, look up the transform between map and the frame in the image header, at the time of the stamp in that header, then republish that pose on a geometry_msgs/PoseStamped topic (e.g. /pose) that you record in the bag. Your script reading the bag would then look more like:

if topic == '/pose':
    # msg is the PoseStamped from bag.read_messages()
    translation = (msg.pose.position.x, msg.pose.position.y, msg.pose.position.z)
    rotation = [msg.pose.orientation.x, msg.pose.orientation.y, msg.pose.orientation.z, msg.pose.orientation.w]

    camera_poses.append((msg.header.stamp, [translation, rotation]))
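
A fuller sketch of the recorder node described above (rospy + tf2_ros; the topic names /camera/color/image_raw and /pose, and the map frame, are assumptions based on this thread):

#!/usr/bin/env python
import rospy
import tf2_ros
from sensor_msgs.msg import Image
from geometry_msgs.msg import PoseStamped

rospy.init_node('camera_pose_recorder')
tf_buffer = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(tf_buffer)
pose_pub = rospy.Publisher('/pose', PoseStamped, queue_size=10)

def image_cb(msg):
    try:
        # map -> optical frame at the image stamp; tf interpolates
        # across the whole tree for us.
        t = tf_buffer.lookup_transform('map', msg.header.frame_id,
                                       msg.header.stamp, rospy.Duration(0.5))
    except tf2_ros.TransformException as e:
        rospy.logwarn('tf lookup failed: %s', str(e))
        return
    pose = PoseStamped()
    pose.header.stamp = msg.header.stamp
    pose.header.frame_id = 'map'
    pose.pose.position.x = t.transform.translation.x
    pose.pose.position.y = t.transform.translation.y
    pose.pose.position.z = t.transform.translation.z
    pose.pose.orientation = t.transform.rotation
    pose_pub.publish(pose)

rospy.Subscriber('/camera/color/image_raw', Image, image_cb)
rospy.spin()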
ydzat commented 1 year ago

You cannot extract the pose like this. [...] I think the easiest way would be: while you record the bag, run a node subscribing to the image topic [...] then republish that pose on a geometry_msgs/PoseStamped topic (e.g. /pose) that you record in the bag.

I just recorded a small dataset; you can download it here.

The tf tree in the bag file contains only two different frames: map and camera_color_optical_frame.

That's why I can't use the lookup function.

matlabbe commented 1 year ago

Who is publishing that map -> camera_color_optical_frame transform?

Note that you need to record /tf_static in the bag too.
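
For example, the record command could look like this (topic names taken from earlier in this thread; add any others you need):

rosbag record /tf /tf_static /camera/color/image_raw /camera/color/camera_info /pose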

ydzat commented 1 year ago

Who is publishing that map -> camera_color_optical_frame transform?

Note that you need to record /tf_static in the bag too.

I believe this map comes from rtabmap. Regarding recording tf_static: I'll try it next. Is there anything I should be aware of when extracting tf after recording tf_static?

matlabbe commented 1 year ago

rtabmap would publish map -> odom, and visual odometry would publish odom -> camera_color_optical_frame only if frame_id for rtabmap were set to camera_color_optical_frame. I guess there is something else publishing map; since we don't see odom either, rtabmap may not even be running.
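
One way to check who is broadcasting each transform is tf's view_frames tool, which writes a frames.pdf annotated with the broadcaster of every edge:

rosrun tf view_frames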

ydzat commented 1 year ago

rtabmap would publish map -> odom, and visual odometry would publish odom -> camera_color_optical_frame only if frame_id for rtabmap were set to camera_color_optical_frame. I guess there is something else publishing map; since we don't see odom either, rtabmap may not even be running.

It's possible. I need to look into it.