luxonis / depthai-ros

Official ROS Driver for DepthAI Sensors.
MIT License

[BUG] Detection output on mono camera looks not to scale #604

Open MRo47 opened 2 weeks ago

MRo47 commented 2 weeks ago

Version: 1 commit behind iron, commit hash cf5d2aaee9117298ea1632c98ffd36a5d7d535ac

Issue

Steps to reproduce

  1. Set config (camera.yaml) as below

    /**:
      ros__parameters:
        camera:
          i_enable_imu: true
          i_enable_ir: true
          i_nn_type: none
          i_pipeline_type: Depth
        left:
          i_publish_topic: true
          i_enable_nn: true
          i_disable_node: false
          i_resolution: '720P'
        left_nn:
          i_board_socket_id: 1
          i_nn_config_path: depthai_ros_driver/yolo
  2. Modify the launch file to add the depthai_filters::Detection2DOverlay node and the detection_labels. The labels have to be set for the overlay as well, since the defaults are the MobileNet-SSD labels.

detection_labels = [
        "person",
        "bicycle",
        "car",
        "motorbike",
        "aeroplane",
        "bus",
        "train",
        "truck",
        "boat",
        "traffic light",
        "fire hydrant",
        "stop sign",
        "parking meter",
        "bench",
        "bird",
        "cat",
        "dog",
        "horse",
        "sheep",
        "cow",
        "elephant",
        "bear",
        "zebra",
        "giraffe",
        "backpack",
        "umbrella",
        "handbag",
        "tie",
        "suitcase",
        "frisbee",
        "skis",
        "snowboard",
        "sports ball",
        "kite",
        "baseball bat",
        "baseball glove",
        "skateboard",
        "surfboard",
        "tennis racket",
        "bottle",
        "wine glass",
        "cup",
        "fork",
        "knife",
        "spoon",
        "bowl",
        "banana",
        "apple",
        "sandwich",
        "orange",
        "broccoli",
        "carrot",
        "hot dog",
        "pizza",
        "donut",
        "cake",
        "chair",
        "sofa",
        "pottedplant",
        "bed",
        "diningtable",
        "toilet",
        "tvmonitor",
        "laptop",
        "mouse",
        "remote",
        "keyboard",
        "cell phone",
        "microwave",
        "oven",
        "toaster",
        "sink",
        "refrigerator",
        "book",
        "clock",
        "vase",
        "scissors",
        "teddy bear",
        "hair drier",
        "toothbrush",
    ]

    detection_viz_node = ComposableNode(
        package="depthai_filters",
        plugin="depthai_filters::Detection2DOverlay",
        parameters=[
            {"label_map": detection_labels},
        ],
        remappings=[
            ("rgb/preview/image_raw", "/oak/left/image_raw"),
            ("nn/detections", "/oak/left_nn/detections"),
        ],
    )

    ...

    ComposableNodeContainer(
            name=f"{name}_container",
            namespace=namespace,
            package="rclcpp_components",
            executable="component_container",
            composable_node_descriptions=[
                ComposableNode(
                    package="depthai_ros_driver",
                    plugin="depthai_ros_driver::Camera",
                    name=name,
                    namespace=namespace,
                    parameters=[
                        params_file,
                        tf_params,
                        parameter_overrides,
                        {"left_nn.i_label_map": detection_labels}
                    ],
                ),
                detection_viz_node,
            ],
            arguments=["--ros-args", "--log-level", log_level],
            prefix=[launch_prefix],
            output="both",
        ),
  3. Build and launch:
    ros2 launch depthai_ros_driver camera.launch.py

Below is how the overlay looks. The person doesn't look like this on camera, trust me, we don't have ghosts ;)

Screenshot from 2024-09-26 14-59-25

MRo47 commented 2 weeks ago

scaling_issue.zip

Complete launch file and yaml config

Serafadam commented 2 weeks ago

Hi, thanks for the report, indeed there is a minor bug here; a fix is on the way, but just to recap:

MRo47 commented 2 weeks ago

@Serafadam Thank you for the pointers above.

Yup, that was my guess: it's the scaling done by the manip node before input to the NN. I can confirm that after setting left_nn.i_enable_passthrough: true and doing the overlay on the passthrough output topic, the output is correctly aligned.
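For anyone following along, here is a minimal sketch of this passthrough workaround as launch-file fragments. The exact passthrough topic name is an assumption based on the driver's usual naming scheme, so verify it with ros2 topic list:

```python
# Hypothetical sketch of the passthrough workaround described above.
# The passthrough topic name is an assumption; confirm with `ros2 topic list`.

# Enable the NN node's passthrough output, which republishes the exact
# (already resized) frame the network consumed.
parameter_overrides = {
    "left_nn": {"i_enable_passthrough": True},
}

# Point the overlay at the passthrough image instead of the full-resolution
# left image, so the detections and the pixels share the same scale.
overlay_remappings = [
    ("rgb/preview/image_raw", "/oak/left_nn/passthrough/image_raw"),
    ("nn/detections", "/oak/left_nn/detections"),
]
```

With this, no desqueezing is needed at all, at the cost of visualizing on the smaller NN-input-sized frame.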

From a design perspective, I have 3 questions:

  1. Shouldn't the desqueeze be performed in the Depth node directly? My reasoning is that the user gave the model an input of size WxH, so they should expect the detections at the same scale, right?
  2. Would the above complicate the implementation somehow?
  3. To perform the desqueeze in the Detection2DOverlay filter node, you need the "squeeze ratios", i.e. input_image_size:NN_input_size. Are these captured somewhere in a topic, or would this require passing the NN configs to the Detection2DOverlay filter?
Serafadam commented 2 weeks ago

An implementation of this desqueeze is available on this branch to test out. A small modification to the Detection2DOverlay node in the launch file is needed for it:

remappings=[
    ("rgb/preview/image_raw", "/oak/left/image_raw"),
    ("rgb/preview/camera_info", "/oak/left_nn/camera_info"),
    ("nn/detections", "/oak/left_nn/detections"),
],

Shouldn't the desqueeze be performed in the Depth node directly? My reasoning is that the user gave the model an input of size WxH, so they should expect the detections at the same scale, right?

At this moment this is not implemented in the API; scaling to the network input size is the default behavior in both the C++ and Python API, as described in the documentation. This might change in the future.

To perform the desqueeze in the Detection2DOverlay filter node, you need the "squeeze ratios", i.e. input_image_size:NN_input_size. Are these captured somewhere in a topic, or would this require passing the NN configs to the Detection2DOverlay filter?

Currently it is done based on the camera_info topic taken from the passthrough image; unfortunately, the ROS vision messages don't carry this information.
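The arithmetic involved can be illustrated with a small sketch (a hypothetical helper, not the actual depthai_filters code): take the NN input size from the passthrough camera_info and the display image size, then scale each box by the per-axis ratios.

```python
def desqueeze_bbox(cx, cy, w, h, nn_w, nn_h, img_w, img_h):
    """Rescale a bbox given in NN-input pixels (nn_w x nn_h) back to
    the display image resolution (img_w x img_h). Hypothetical helper
    for illustration only."""
    sx = img_w / nn_w  # horizontal "squeeze ratio"
    sy = img_h / nn_h  # vertical "squeeze ratio"
    return (cx * sx, cy * sy, w * sx, h * sy)

# Example: a detection centered on a 640x360 NN input, drawn on a
# 1280x720 mono frame (both ratios are exactly 2.0 here).
cx, cy, w, h = desqueeze_bbox(320, 180, 100, 50, 640, 360, 1280, 720)
print(cx, cy, w, h)  # 640.0 360.0 200.0 100.0
```

Note that when the NN input and the display image have different aspect ratios, the two ratios differ, which is exactly the stretching visible in the screenshot above.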

MRo47 commented 1 week ago

Hey @Serafadam

I was able to work around this in #606