luxonis / depthai-ros

Official ROS Driver for DepthAI Sensors.
MIT License

[custom] What is the camera referential? #243

Closed. an99990 closed this issue 1 year ago.

an99990 commented 1 year ago

Hi, I am using the OAK-D camera to detect some objects. After detecting them I calculate their centroids. I have tested my code with the ZED stereo camera and it works great. However, the same code for the OAK-D doesn't work: the pose that I obtain has the same value for the x, y, and z coordinates.

Here is the code that calculates the centroids:

        center_x = camera_model.cx() * scale
        center_y = camera_model.cy() * scale
        unit_scaling = DepthTraits.to_meters(1)
        constant_x = unit_scaling / (camera_model.fx() * scale)
        constant_y = unit_scaling / (camera_model.fy() * scale)
        for mask in depth_mask:
            if mask is None:
                continue
            centroid_pose = np.zeros(3).astype(np.float32)
            n_valid_points = 0
            for j in range(depth_image.shape[0]):
                for i in range(depth_image.shape[1]):
                    if mask[j][i] == 0:
                        continue
                    depth = depth_image[j][i]
                    if not DepthTraits.valid(np.array(depth)):
                        continue
                    depth = min(depth, range_max) if range_max > 0 else depth

                    depth_point = np.array([
                        (i - center_x) * depth * constant_x,
                        (j - center_y) * depth * constant_y,
                        DepthTraits.to_meters(depth)])
                    centroid_pose += depth_point[0]
                    n_valid_points += 1

            if n_valid_points == 0:
                centroid_pose = np.array([np.nan, np.nan, np.nan])
            else:
                centroid_pose /= n_valid_points
            centroids.append(centroid_pose)
        return centroids
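
For reference, the inner loop above is the standard pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, with Z taken from the depth image. A minimal vectorized sketch of the same idea (the function name, and the assumption that the depth image is already in meters and the mask is boolean, are mine, not the code above):

  import numpy as np

  def centroid_from_depth(depth_m, mask, fx, fy, cx, cy):
      """Back-project masked depth pixels with the pinhole model and average them.

      depth_m: HxW depth image in meters; mask: HxW boolean array (hypothetical inputs).
      Returns the centroid (X, Y, Z) in the camera's optical frame, or NaNs if the mask is empty.
      """
      v, u = np.nonzero(mask)              # pixel rows (v) and columns (u) inside the mask
      z = depth_m[v, u]
      valid = np.isfinite(z) & (z > 0)     # drop invalid depth readings
      u, v, z = u[valid], v[valid], z[valid]
      if z.size == 0:
          return np.array([np.nan, np.nan, np.nan])
      x = (u - cx) * z / fx                # pinhole back-projection, optical-frame axes
      y = (v - cy) * z / fy
      return np.stack([x, y, z], axis=1).mean(axis=0)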

I then publish them as a tag_pose object as follows. I treat x as y, y as z, and z as x. This still doesn't explain why I would get x = y = z every time.

  # we have to inverse to camera frame so x->y y->z z->x
  tag_pose.pose.position.x = float(pose[2])
  tag_pose.pose.position.y = -float(pose[0])
  tag_pose.pose.position.z = -float(pose[1])
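
For what it's worth, that remap matches ROS REP 103: camera optical frames use x right, y down, z forward, while regular body frames use x forward, y left, z up, so a point goes from optical to body coordinates as (x, y, z) -> (z, -x, -y). A minimal sketch with a hypothetical helper name:

  import numpy as np

  def optical_to_body(p_optical):
      """Convert a point from a camera optical frame (x right, y down, z forward)
      to the ROS body-frame convention (x forward, y left, z up), per REP 103."""
      x, y, z = p_optical
      return np.array([z, -x, -y])

If the driver publishes an optical frame for the camera in TF, tf2 can apply this transform for you instead of hard-coding the axis swap; either way it only swaps axes and flips signs, so it would not make x, y, and z equal.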

Thank you.