Yao-Shao / Waymo_Kitti_Adapter

A tool converting Waymo dataset format to Kitti dataset format.
MIT License

Unable to plot 3D bounding box #3

Open sarimmehdi opened 4 years ago

sarimmehdi commented 4 years ago

Hello, I am using the following code to plot 3D bounding boxes in the image plane:

    # Assumes label_info, img, and colors are defined elsewhere in the script.
    import numpy as np
    import cv2

    box_3d = []
    center = label_info['bbox3d_loc']   # 3D box location from the converted label
    dims = label_info['bbox3d_dim']     # 3D box dimensions
    rot_y = label_info['rot_y']         # rotation around the camera Y axis

    # enumerate the 8 corners of the box and project each into the image
    for i in [1, -1]:
        for j in [1, -1]:
            for k in [0, 1]:
                point = np.copy(center)
                point[1] = center[1] + i * dims[2] / 2 * np.cos(-rot_y + np.pi / 2) + (j * i) * dims[1] / 2 * np.cos(
                        -rot_y)
                point[0] = center[0] + i * dims[2] / 2 * np.sin(-rot_y + np.pi / 2) + (j * i) * dims[1] / 2 * np.sin(
                        -rot_y)
                point[2] = center[2] - k * dims[0]

                point = np.append(point, 1)                       # homogeneous coordinates
                point = np.dot(label_info['cam_to_img'], point)   # project with the camera matrix
                point = point[:2] / point[2]                      # perspective division
                point = point.astype(np.int16)
                box_3d.append(point)
    # project the 3D box centre as well, so it can be marked on the image
    point = np.copy(center)
    point[0] = center[0]
    point[2] = center[2]
    point[1] = center[1]
    point = np.append(point, 1)
    point = np.dot(label_info['cam_to_img'], point)
    point = point[:2] / point[2]
    point = point.astype(np.int16)
    box_3d.append(point)
    # draw the twelve box edges and mark the projected centre
    for i in range(4):
        point_1_ = box_3d[2 * i]
        point_2_ = box_3d[2 * i + 1]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), colors['pink'], 1)

    for i in range(8):
        point_1_ = box_3d[i]
        point_2_ = box_3d[(i + 2) % 8]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), colors['pink'], 1)
    cv2.circle(img, (box_3d[-1][0], box_3d[-1][1]), 10, colors['red'], -1)

However, the bounding boxes are displaced up and slightly to the left or right. Please let me know what I can do here. I am using the P0 intrinsic matrix as label_info['cam_to_img'].

Yao-Shao commented 4 years ago

Hi @sarimmehdi, I think it is caused by a wrong coordinate system. In the Waymo dataset, all 3D bbox data are in the vehicle frame, so each bbox needs to be transformed into the camera frame (using the transformation matrices Tr_velo_to_cam_0-4) before you draw it on the images.
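
A minimal sketch of that transform, assuming Tr_velo_to_cam_0 has been parsed as a 4x4 homogeneous matrix and P0 as a 3x4 projection matrix (names are illustrative, not code from this repo):

    import numpy as np

    def vehicle_to_image(pt_vehicle, Tr_velo_to_cam_0, P0):
        """Project a single vehicle-frame point into camera 0."""
        pt_h = np.append(pt_vehicle, 1.0)   # homogeneous coordinates (x, y, z, 1)
        pt_cam = Tr_velo_to_cam_0 @ pt_h    # vehicle frame -> camera frame
        uvw = P0 @ pt_cam                   # camera frame -> image plane
        return uvw[:2] / uvw[2]             # perspective division -> pixel (u, v)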

sarimmehdi commented 4 years ago

Hi @Yao-Shao, thank you. Can you share a small code snippet showing how to do this? I just tried multiplying by Tr_velo_to_cam_0 and then P0, but now the bounding box is closer to the ego car (at least it is at the correct height now).

This is the code I am using (pts is a list of np.array points of shape (4,): the points taken directly from the label file generated by your code, with the first three entries being the x, y, and z coordinates and the last entry being 1):

    import numpy as np  # calib is assumed to hold the parsed calibration matrices

    P = calib['P0']                    # 3x4 projection matrix of camera 0
    R0 = np.eye(4)
    R0[:-1, :-1] = calib['R0_rect']    # rectification matrix padded to 4x4
    Tr = calib['Tr_velo_to_cam_0']     # vehicle-to-camera transformation

    pts_array = np.array(pts).transpose()    # (4, N) homogeneous points
    pts_array = pts_array[:, pts_array[-1, :] > 0]
    pts_array[-1, :] = 1
    A = np.matmul(Tr, pts_array)             # vehicle frame -> camera frame
    B = np.matmul(R0, A)                     # apply rectification
    mask = B[0, :] >= 0  # X >= 0, so points behind the ego car are ignored
    B = B[:, mask]
    Y = np.matmul(P, B)                      # project onto the image plane
    pts2d = (Y / Y[2, :])[:-1, :].astype(int)

The output of this code is an np.array of shape (2, M), where M is the number of points.

EDIT: I also posted about this issue here: https://github.com/waymo-research/waymo-open-dataset/issues/107

I would like to ask why you are changing the format of the intrinsic matrix. KITTI's intrinsic matrix looks like this:

    [f^(2)_u,  0,         c^(2)_u,  0;
     0,        f^(2)_v,   c^(2)_v,  0;
     0,        0,         1,        0]

But in your code, the resulting intrinsic matrix has its columns shifted to different positions. This might actually be the reason for the incorrect 3D bounding boxes, but I cannot figure out how to fix it.

minghanz commented 4 years ago

@sarimmehdi The reason the intrinsic matrix is not in the format you mentioned above is that the coordinate system used in the Waymo dataset is not right-down-front but front-left-up. The intrinsic matrix you mentioned operates on the former (right-down-front).
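
For illustration, a point given in the front-left-up convention can be permuted into right-down-front like this (a sketch; the matrix simply reorders and flips the axes):

    import numpy as np

    # right = -left, down = -up, front = front
    flu_to_rdf = np.array([[0, -1,  0],
                           [0,  0, -1],
                           [1,  0,  0]])
    p_flu = np.array([10.0, 2.0, 1.5])   # 10 m ahead, 2 m to the left, 1.5 m up
    p_rdf = flu_to_rdf @ p_flu           # -> array([-2. , -1.5, 10. ])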

sarimmehdi commented 4 years ago

@minghanz then what are the fu, fv, cu, and cv values for such a matrix? Also, even after doing the axis switching, the 3D bounding boxes are still off by a slight margin.

minghanz commented 4 years ago

@sarimmehdi I checked the 3D bbox projection and it seems right to me. The new intrinsic matrix is just K_new = np.matmul(K, np.array([0, -1, 0, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1]).reshape(4, 4)), which is the same intrinsic matrix expressed in a different coordinate frame.

If your 3D box projections are still not well aligned with the objects, there are likely some errors in your transformation and projection code. (For example, at the beginning I wrongly applied the heading-angle rotation matrix to 3D points in the vehicle frame, when it should be applied in the box-centred frame.) You can check out https://github.com/gdlg/simple-waymo-open-dataset-reader; their implementation of 3D box visualization should work well.
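
For reference, a hedged sketch of that order of operations (illustrative names, not code from this repo): build the corners in the box-centred frame, apply the heading rotation there, and only then translate into the vehicle frame before any camera transform and projection.

    import numpy as np

    def box_corners_vehicle(center, dims, heading):
        """center: box centre (x, y, z) in the vehicle frame,
        dims: (length, width, height), heading: rotation about the vehicle z axis."""
        l, w, h = dims
        # 8 corners in the box-centred frame (x forward, y left, z up)
        x = np.array([ 1,  1, -1, -1,  1,  1, -1, -1]) * l / 2
        y = np.array([ 1, -1, -1,  1,  1, -1, -1,  1]) * w / 2
        z = np.array([ 1,  1,  1,  1, -1, -1, -1, -1]) * h / 2
        corners = np.vstack([x, y, z])               # shape (3, 8)
        c, s = np.cos(heading), np.sin(heading)
        rot_z = np.array([[c, -s, 0],
                          [s,  c, 0],
                          [0,  0, 1]])
        # rotate in the box-centred frame, then translate into the vehicle frame
        return rot_z @ corners + np.asarray(center).reshape(3, 1)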

RocketFlash commented 4 years ago

@sarimmehdi I have the same problem. I don't multiply the intrinsic matrix by waymo_cam_RT in adapter.py; instead I multiply the object coordinates by the transformation matrix from the vehicle to the camera coordinate system (Tr_velo_to_cam), and after that by the transformation matrix from the Waymo camera coordinate system to the KITTI camera coordinate system (they have a different axis order). After that I get the following result:

Screenshot 2020-04-09 at 19 54 11

Have you solved this problem?
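
For reference, a rough sketch of the chain described above, assuming Tr_velo_to_cam is the 4x4 vehicle-to-camera extrinsic and P is the 3x4 projection matrix of the same camera (names are illustrative):

    import numpy as np

    # Waymo camera axes (front-left-up) -> KITTI camera axes (right-down-front)
    waymo_cam_to_kitti_cam = np.array([[0, -1,  0, 0],
                                       [0,  0, -1, 0],
                                       [1,  0,  0, 0],
                                       [0,  0,  0, 1]])

    def project_vehicle_points(pts_vehicle_h, Tr_velo_to_cam, P):
        """pts_vehicle_h: (4, N) homogeneous points in the vehicle frame."""
        pts_cam = waymo_cam_to_kitti_cam @ (Tr_velo_to_cam @ pts_vehicle_h)
        uvw = P @ pts_cam          # (3, N)
        return uvw[:2] / uvw[2]    # pixel coordinates, shape (2, N)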

RocketFlash commented 4 years ago

I found a solution and made changes in my fork of this repo. I transformed the 3D bounding boxes into KITTI style (3D bounding boxes in the camera coordinate system), and now the bounding boxes look good.

Screenshot 2020-04-13 at 20 49 26

@Yao-Shao maybe it would be a good idea to add such an option for users, perhaps as a flag? I can make a pull request for this option.
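
A rough sketch of that KITTI-style conversion, with illustrative names (not the exact code in the fork): KITTI stores the bottom-centre of the box in the camera frame and the dimensions in height, width, length order.

    import numpy as np

    def to_kitti_location_and_dims(center_cam, dims_lwh):
        """center_cam: box centre already in the KITTI camera frame
        (x right, y down, z forward); dims_lwh: (length, width, height)."""
        l, w, h = dims_lwh
        loc = np.array(center_cam, dtype=float)
        loc[1] += h / 2.0    # y points down, so the bottom face sits at y + h/2
        # rotation_y must still be derived from the Waymo heading and the camera
        # extrinsics; that step is omitted here.
        return loc, (h, w, l)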

RocketFlash commented 4 years ago

@z393 I am using https://github.com/kuixu/kitti_object_vis for visualization. Have you tried to visualize this sample without the KITTI conversion? Maybe the problem is boxes shifted due to interpolation-based labelling. Try visualizing it with something like https://github.com/gdlg/simple-waymo-open-dataset-reader/blob/master/examples/visualise_labels_and_lidar.py

enesozi commented 4 years ago

Hi @RocketFlash, I tried your fork but I had two issues:

  1. Many Vehicle and Pedestrian samples are being saved as SIGN in the label files.
  2. The positions of the objects are correct only in the front cameras' (0, 1, 2) FOVs; in the other cameras there is still quite a misalignment.

Did you have those issues as well?