mgonzs13 / yolo_ros

Ultralytics YOLOv8, YOLOv9, YOLOv10, YOLOv11 for ROS 2
GNU General Public License v3.0
336 stars 89 forks source link

distance to object #40

Closed Timofeibb closed 5 months ago

Timofeibb commented 5 months ago

how to get distance to an object and publish it with object name and it's score? for box_data in results.boxes: hypothesis = { "class_id": int(box_data.cls), "class_name": self.yolo.names[int(box_data.cls)], "score": float(box_data.conf) } for example in this part add distance line

mgonzs13 commented 5 months ago

Hi @Timofeibb, If you want the distance to the detected objects, you can use the 3D version that returns 3D coordinates within a frame. Then you can compute the Euclidian distance. Is this what you want?

Timofeibb commented 5 months ago

almost right it can be done in this way in 3d_node part

        x, y, z = bbox3d.center.position.x, bbox3d.center.position.y, bbox3d.center.position.z
        distance = np.sqrt(x**2 + y**2 + z**2)

but i don't know how to print this data with object frame, like score and class name

mgonzs13 commented 5 months ago

What do you mean by this data with object frame?

Timofeibb commented 5 months ago

okay, your code image result output topic is yolo/dbg_image, it contain class_name, class_id and score, i want to include distance to an object along with these values. I have changed some nodes and get output about distance to each object in terminal (/yolo/detections topic, fig. 1)

屏幕截图 2024-05-22 095701 屏幕截图 2024-05-22 095601

The changes was made in detect_3d_node (calculate distance) and yolov8_node (subscribe and publish to the main detection topic), you can check the second one def image_cb part

def image_cb(self, msg: Image) -> None:
        if self.enable:
            # convert image + predict
            cv_image = self.cv_bridge.imgmsg_to_cv2(msg)
            results = self.yolo.predict(
                source=cv_image,
                verbose=False,
                stream=False,
                conf=self.threshold,
                device=self.device
            )
            results: Results = results[0].cpu()
            if results.boxes:
                hypothesis = self.parse_hypothesis(results)
                boxes = self.parse_boxes(results)
                if results.masks:
                    masks = self.parse_masks(results)
                if results.keypoints:
                    keypoints = self.parse_keypoints(results)
                # create detection msgs
                detections_msg = DetectionArray()
                for i in range(len(results)):
                    aux_msg = Detection()
                    if results.boxes:
                        aux_msg.class_id = hypothesis[i]["class_id"]
                        aux_msg.class_name = hypothesis[i]["class_name"]
                        aux_msg.score = hypothesis[i]["score"]
                        aux_msg.bbox = boxes[i]
                    if results.masks:
                        aux_msg.mask = masks[i]
                    if results.keypoints:
                        aux_msg.keypoints = keypoints[i]

                    if self.detections_msg and i < len(self.detections_msg.detections):
                        aux_msg.distance = self.detections_msg.detections[i].distance
                        self.get_logger().info(f"Distance to object {aux_msg.class_name}: {aux_msg.distance}")
                    detections_msg.detections.append(aux_msg)

                # publish detections
                detections_msg.header = msg.header
                self._pub.publish(detections_msg)
            del results
            del cv_image

    def get_detection_array(self) -> None:
        # Subscribe to DetectionArray messages
        self.create_subscription(
            DetectionArray,
            "/yolo/detections_3d",
            self.detection_cb,
            10 # QoS Profile
        )

    def detection_cb(self, msg: DetectionArray) -> None:
        # Callback to handle DetectionArray messages
        self.detections_msg = msg

but now im stuck here, i dont know how to publish this data with other values

屏幕截图 2024-05-22 095814
Timofeibb commented 5 months ago

Problem solved, I was looking in the wrong place. I should have initially paid attention to debug_node i'v changed 1 part and got 2 output 1

def draw_box(self, cv_image: np.array, detection: Detection, color: Tuple[int]) -> np.array:

        # get detection info
        label = detection.class_name
        score = detection.score
        box_msg: BoundingBox2D = detection.bbox
        track_id = detection.id
        distance = detection.distance

        min_pt = (round(box_msg.center.position.x - box_msg.size.x / 2.0),
                  round(box_msg.center.position.y - box_msg.size.y / 2.0))
        max_pt = (round(box_msg.center.position.x + box_msg.size.x / 2.0),
                  round(box_msg.center.position.y + box_msg.size.y / 2.0))

        # draw box
        cv2.rectangle(cv_image, min_pt, max_pt, color, 2)

        # write text
        label = "{} ({}) ({:.3f}) ({:.3f})".format(label, str(track_id), score, distance)
        pos = (min_pt[0] + 5, min_pt[1] + 25)
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(cv_image, label, pos, font,
                    1, color, 1, cv2.LINE_AA)

        return cv_image

2

屏幕截图 2024-05-22 105739

the last value is the distance to an object