Foundation Pose with custom segmentation

JRvilanova commented 2 months ago

Hi all,

Thanks for this magnificent work. I came here trying to speed up FoundationPose. However, I am struggling to get it working: RT-DETR does not produce a good enough mask for my object, so the resulting pose is not good enough either. To work around that, I am running my own segmentation, publishing the mask on a ROS 2 topic, and trying to feed it to the FoundationPose node.

Since FoundationPose used a .png mask with no background, I publish mine the same way, either as rgb8 with the background set to black or as 8UC4.
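For reference, here is a minimal sketch of collapsing either of those encodings into a single-channel binary mask. It assumes the segmentation input is expected as a one-channel 8-bit image (mono8); that is an assumption worth checking against the isaac_ros_foundationpose documentation. The function name `to_mono8_mask` is hypothetical, the buffer is assumed to be tightly packed (no row padding), and for 8UC4 the fourth channel is assumed to be alpha.

```python
def to_mono8_mask(data: bytes, width: int, height: int, encoding: str) -> bytes:
    """Collapse a tightly packed rgb8 / rgba8 / 8UC4 buffer into a mono8 mask.

    Returns width * height bytes, each 0 (background) or 255 (object).
    """
    if encoding in ('rgba8', '8UC4'):
        channels = 4  # assumption: the 4th channel of 8UC4 is alpha
    elif encoding == 'rgb8':
        channels = 3
    else:
        raise ValueError(f'unsupported encoding: {encoding}')

    if len(data) != width * height * channels:
        raise ValueError('buffer size does not match width * height * channels')

    if channels == 4:
        # Foreground is wherever alpha is non-zero.
        return bytes(255 if a else 0 for a in data[3::4])

    # rgb8: foreground is any pixel that is not pure black.
    red, green, blue = data[0::3], data[1::3], data[2::3]
    return bytes(255 if (r | g | b) else 0 for r, g, b in zip(red, green, blue))


# Two-pixel example: first pixel is background, second is object.
rgba_pixels = bytes([10, 20, 30, 0, 10, 20, 30, 255])
mask = to_mono8_mask(rgba_pixels, width=2, height=1, encoding='rgba8')
```

The resulting bytes would then go into a sensor_msgs/Image with `encoding='mono8'` and `step` equal to `width`, ideally reusing the header of the RGB frame the mask was computed from so the inputs can be matched up. On full-size frames the same logic is usually written with NumPy for speed.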

One more assumption: since I could not find an .onnx model for FoundationPose, I converted the .etlt to .onnx myself. I do not fully understand how that conversion works.

I created my custom package with this launch file:

import os

from ament_index_python.packages import get_package_share_directory
import launch
from launch.actions import DeclareLaunchArgument
from launch.conditions import IfCondition
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import ComposableNodeContainer, Node
from launch_ros.descriptions import ComposableNode

REFINE_MODEL_PATH = '/workspaces/isaac_ros-dev/isaac_ros_assets/models/foundationpose/refine_model.onnx'
REFINE_ENGINE_PATH = '/workspaces/isaac_ros-dev/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan'
SCORE_MODEL_PATH = '/workspaces/isaac_ros-dev/isaac_ros_assets/models/foundationpose/score_model.onnx'
SCORE_ENGINE_PATH = '/workspaces/isaac_ros-dev/isaac_ros_assets/models/foundationpose/score_trt_engine.plan'

def generate_launch_description():
    rviz_config_path = os.path.join(
        get_package_share_directory('isaac_ros_foundationpose'),
        'rviz', 'foundationpose_realsense.rviz')

    launch_args = [
        DeclareLaunchArgument(
            'mesh_file_path',
            default_value='/workspaces/models/textured_mesh.obj',
            description='The absolute file path to the mesh file'),

        DeclareLaunchArgument(
            'texture_path',
            default_value='/workspaces/models/material_0.png',
            description='The absolute file path to the texture map'),

        DeclareLaunchArgument(
            'refine_model_file_path',
            default_value=REFINE_MODEL_PATH,
            description='The absolute file path to the refine model'),

        DeclareLaunchArgument(
            'refine_engine_file_path',
            default_value=REFINE_ENGINE_PATH,
            description='The absolute file path to the refine trt engine'),

        DeclareLaunchArgument(
            'score_model_file_path',
            default_value=SCORE_MODEL_PATH,
            description='The absolute file path to the score model'),

        DeclareLaunchArgument(
            'score_engine_file_path',
            default_value=SCORE_ENGINE_PATH,
            description='The absolute file path to the score trt engine'),

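        DeclareLaunchArgument(
            'container_name',
            default_value='foundationpose_container',
            description='Name for ComposableNodeContainer'),

        # Declared because rviz_node below is conditioned on launch_rviz.
        DeclareLaunchArgument(
            'launch_rviz',
            default_value='False',
            description='Flag to enable RViz2 visualization'),
    ]

    launch_rviz = LaunchConfiguration('launch_rviz')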
    mesh_file_path = LaunchConfiguration('mesh_file_path')
    texture_path = LaunchConfiguration('texture_path')
    refine_model_file_path = LaunchConfiguration('refine_model_file_path')
    refine_engine_file_path = LaunchConfiguration('refine_engine_file_path')
    score_model_file_path = LaunchConfiguration('score_model_file_path')
    score_engine_file_path = LaunchConfiguration('score_engine_file_path')
    container_name = LaunchConfiguration('container_name')

    foundationpose_node = ComposableNode(
        name='foundationpose_node',
        package='isaac_ros_foundationpose',
        plugin='nvidia::isaac_ros::foundationpose::FoundationPoseNode',
        parameters=[{
            'mesh_file_path': mesh_file_path,
            'texture_path': texture_path,

            'refine_model_file_path': refine_model_file_path,
            'refine_engine_file_path': refine_engine_file_path,
            'refine_input_tensor_names': ['input_tensor1', 'input_tensor2'],
            'refine_input_binding_names': ['input1', 'input2'],
            'refine_output_tensor_names': ['output_tensor1', 'output_tensor2'],
            'refine_output_binding_names': ['output1', 'output2'],

            'score_model_file_path': score_model_file_path,
            'score_engine_file_path': score_engine_file_path,
            'score_input_tensor_names': ['input_tensor1', 'input_tensor2'],
            'score_input_binding_names': ['input1', 'input2'],
            'score_output_tensor_names': ['output_tensor'],
            'score_output_binding_names': ['output1'],
        }],
        remappings=[
            ('pose_estimation/depth_image', '/zed/zed_node/depth/depth_registered'),
            ('pose_estimation/image', '/zed/zed_node/right/image_rect_color'),
            ('pose_estimation/camera_info', '/zed2i/zed_node/color/camera_info'),
            ('pose_estimation/segmentation', 'own_segmentation'),
            ('pose_estimation/output', 'output')])

    rviz_node = Node(
        package='rviz2',
        executable='rviz2',
        name='rviz2',
        arguments=['-d', rviz_config_path],
        condition=IfCondition(launch_rviz))

    foundationpose_container = ComposableNodeContainer(
        name=container_name,
        namespace='',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[
            foundationpose_node,
        ],
        output='screen'
    )

    return launch.LaunchDescription(launch_args + [foundationpose_container,
                                                   rviz_node])
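
One thing that stands out in the remappings above is that the absolute topics do not all share the same root namespace (`/zed` for the images, `/zed2i` for the camera info). Whether that is intentional depends on the camera driver configuration, so it is worth comparing against `ros2 topic list`. A small sketch of a check, using a hypothetical helper `group_by_root_namespace` that is not part of Isaac ROS:

```python
from collections import defaultdict


def group_by_root_namespace(remappings):
    """Group absolute remap targets by their first path segment.

    Relative targets (no leading '/') are skipped, since they resolve
    against the node's own namespace.
    """
    groups = defaultdict(list)
    for source, target in remappings:
        if target.startswith('/'):
            root = target.split('/')[1]
            groups[root].append((source, target))
    return dict(groups)


# Remappings copied from the launch file above.
remappings = [
    ('pose_estimation/depth_image', '/zed/zed_node/depth/depth_registered'),
    ('pose_estimation/image', '/zed/zed_node/right/image_rect_color'),
    ('pose_estimation/camera_info', '/zed2i/zed_node/color/camera_info'),
    ('pose_estimation/segmentation', 'own_segmentation'),
    ('pose_estimation/output', 'output'),
]

groups = group_by_root_namespace(remappings)
for root, entries in groups.items():
    print(root, [target for _, target in entries])
```

If more than one root shows up, one of the inputs may be pointing at a topic that is never published, in which case the node would simply wait for data. It may also be worth confirming that the color image and the registered depth come from the same sensor, since the launch file pairs the right image with `depth_registered`.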

Can you help me?

Thanks in advance!

ammar-n-abbas commented 1 month ago

We have built a custom version of FoundationPose for ROS2, set up in a Conda environment, that supports multi-object tracking and end-to-end SAM2-based segmentation:

https://github.com/ammar-n-abbas/FoundationPoseROS2

JRvilanova commented 1 month ago

Hey @ammar-n-abbas, thanks for your response.

I've tested your code on my custom data and it works pretty nicely. However, I do not see how your code would make FoundationPose any faster than it runs right now, which is my current problem.

If I have misunderstood its purpose and you have in fact sped it up, let me know.

Thanks mate!

NVIDIA-ISAAC-ROS / isaac_ros_pose_estimation

Foundation Pose with custom segmentation #48