stereolabs / zed-sdk

⚡️The spatial perception framework for rapidly building smart robots and spaces
https://stereolabs.com
MIT License

Saving rendered 3d bounding box coordinates in the image space (pixel coordinates) from ZED SDK? #457

Open harishkool opened 2 years ago

harishkool commented 2 years ago

Proposal

I am following the object detection image viewer example in stereolabs/zed-examples (zed-examples/object detection/image viewer, master branch). It uses OpenGL to draw objects on the image. Right now the 3D bounding box coordinates from the ZED SDK are normalized. It would be great if the ZED SDK offered a way to return the 3D bounding box coordinates in pixel space, taking the projection matrix and image shape as input. This is how I currently compute the 3D bounding box coordinates in image space, i.e., pixel coordinates:

        bbox = objects.object_list[i].bounding_box
        # _cam_mat = np.array(_cam, np.float32).reshape(4,4)
        N = 8
        hom_obj_coords = np.c_[bbox, np.ones(N)]
        proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
        # proj3D_cam[1] = proj3D_cam[1] + 0.25

        # proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
        # , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]

        proj2D = [((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5)
                , ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080)*0.5))]
        proj2D_x = proj2D[0]
        proj2D_y = proj2D[1]

where _cam_mat is the projection matrix I got from the OpenGL code. However, the projected boxes do not line up with the objects in the image. It would be great if the ZED SDK supported this directly.

Use-Case

Saving the 3D bounding boxes in the pixel space will help to train any custom 3D object detection network without any associated point clouds.

obraun-sl commented 2 years ago

Hi,

The best option is to use OpenCV's projectPoints function, which is made for exactly this: https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga1019495a2c8d1743ed5cc23fa0daff8c

The cameraMatrix is given by CameraInformation().calibration_parameters, and R, T are the pose of the camera (if necessary).
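
For reference, here is a minimal sketch of the projection that projectPoints performs when rotation and translation are zero and lens distortion is ignored (so it only applies to undistorted images). It assumes the box corners are expressed in the camera frame with X right, Y down, Z forward, and that the intrinsics fx, fy, cx, cy match the resolution of the image being drawn on. The function name project_points_pinhole, the intrinsic values, and the sample box are hypothetical and are not part of the ZED SDK or OpenCV.

```python
from typing import List, Sequence, Tuple


def project_points_pinhole(
    points_cam: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[float, float]]:
    """Project 3D camera-frame points to pixel coordinates (pinhole model).

    Assumes X right, Y down, Z forward, and no lens distortion.
    """
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            # A point at or behind the image plane has no valid projection.
            raise ValueError("point is not in front of the camera")
        u = fx * x / z + cx  # perspective divide, then shift to the principal point
        v = fy * y / z + cy
        pixels.append((u, v))
    return pixels


# Hypothetical intrinsics for a 1920x1080 image (not real calibration values).
FX, FY, CX, CY = 1000.0, 1000.0, 960.0, 540.0

# Hypothetical 1 m cube centred 3 m in front of the camera: 8 corners.
box = [
    (sx * 0.5, sy * 0.5, 3.0 + sz * 0.5)
    for sx in (-1, 1)
    for sy in (-1, 1)
    for sz in (-1, 1)
]

corners_px = project_points_pinhole(box, FX, FY, CX, CY)
```

Each entry of corners_px is a (u, v) pixel pair that can be saved alongside the image. If the corners are in a world frame rather than the camera frame, they would first need to be transformed by the camera pose, which is the role of R and T in projectPoints.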

fennecinspace commented 1 year ago

@obraun-sl I have tried this, but the rotation of the resulting bounding boxes looks wrong. @harishkool please share if you've found a solution to fix the wonky boxes.
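
One possible cause, offered only as a guess since the thread does not confirm it, is a coordinate-system mismatch: the pinhole model behind projectPoints expects X right, Y down, Z forward, whereas an OpenGL-style right-handed Y-up frame has Y up and Z pointing backward. The sketch below converts points between the two by negating Y and Z. The helper name y_up_to_image_frame is hypothetical.

```python
from typing import List, Sequence, Tuple


def y_up_to_image_frame(
    points: Sequence[Sequence[float]],
) -> List[Tuple[float, float, float]]:
    """Convert right-handed Y-up points (X right, Y up, Z backward)
    to the image convention (X right, Y down, Z forward)."""
    return [(x, -y, -z) for x, y, z in points]


# A point 3 m in front of the camera and 1 m above the optical axis,
# expressed in the Y-up frame (forward is -Z there).
converted = y_up_to_image_frame([(0.5, 1.0, -3.0)])
```

After conversion the point has positive Z (in front of the camera) and negative Y (above the image centre), which is what a pinhole projection expects.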

tavasolireza commented 1 year ago

@fennecinspace Did you eventually solve this?