Hi @choiszt,
I think you may be misinterpreting our transformation notation here:
cam2dev=gt_provider.get_aria_transform_device_camera(stream_id).inverse().to_matrix()
dev2sce=gt_provider.get_aria_3d_pose_by_timestamp_ns(STAMP).data().transform_scene_device.inverse().to_matrix()
Please see our documentation on frame convention here
tl;dr:
gt_provider.get_aria_transform_device_camera(stream_id) returns the transformation from camera to device. I believe you have the inverse in your example code.
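As a minimal sketch of the convention (the point values are purely illustrative):

import numpy as np

# T_Device_Camera maps a point expressed in the camera frame into the device frame:
T_device_camera = gt_provider.get_aria_transform_device_camera(stream_id).to_matrix()

p_camera = np.array([0.1, 0.2, 1.0, 1.0])  # homogeneous point in the camera frame
p_device = T_device_camera @ p_camera      # the same point, expressed in the device frame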
Hello @nickcharron, I really appreciate your response! So I guess the correct transformations are:
T_Dev_Cam = gt_provider.get_aria_transform_device_camera(stream_id).to_matrix()  # T_Device_Camera
T_Sce_Dev = gt_provider.get_aria_3d_pose_by_timestamp_ns(STAMP).data().transform_scene_device.to_matrix()  # T_Scene_Device
T_Sce_Cam = T_Sce_Dev @ T_Dev_Cam  # transform from the camera frame to the scene frame
pcd_in_Scene = (T_Sce_Cam @ homogenous_coords.T).T
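(Here homogenous_coords is assumed, purely for illustration, to be an N x 4 array of homogeneous camera-frame points, built along these lines:)

import numpy as np

# Hypothetical construction: pcd_in_camera is an N x 3 array of 3D points in the camera frame
homogenous_coords = np.hstack([pcd_in_camera, np.ones((pcd_in_camera.shape[0], 1))])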
But I still get a mismatch in Meshlab.
Those transform definitions look correct to me. I suspect there's still a transform error somewhere in your code.
I also noticed it looks like you're creating your own unprojection assuming a pinhole camera model, which won't be very accurate unless you undistort the images first. I recommend using the unproject function we have in our calibration API.
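For a single pixel, that would look roughly like this (a sketch only: depth_image is assumed to be the raw depth map as a numpy array, and how the depth value relates to the unprojected ray, z-depth vs. range along the ray, is worth double-checking against the docs):

import numpy as np

# Look up the fisheye calibration for this stream
device_calib = gt_provider.raw_data_provider_ptr().get_device_calibration()
label = gt_provider.raw_data_provider_ptr().get_label_from_stream_id(stream_id)
cam_calib = device_calib.get_camera_calib(label)

x, y = 704, 704                                                # example pixel
ray = cam_calib.unproject(np.array([x, y], dtype=np.float64))  # 3D point on the ray through (x, y)
depth_m = depth_image[y, x] / 1000.0                           # assuming depth is stored in millimeters
point_in_camera = ray / ray[2] * depth_m                       # scaled so that z matches the depth value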
Let me know if this fixes it. In the meantime, I will try to make an example for generating pointclouds. Hopefully this will help.
Hi @nickcharron! I'm not sure if I've made myself clear. The unproject function seems to map a single x, y tensor (with shape [2, 1]) to a 3D coordinate, and I expect projecting every pixel in the image to 3D that way would be time-consuming. I did take some inspiration from the "undistort" concept, and implemented it as follows:
from projectaria_tools.core import calibration

def undistort(image):
    # Look up the source (fisheye) calibration for this stream
    sensor_name = gt_provider.raw_data_provider_ptr().get_label_from_stream_id(stream_id)
    device_calib = gt_provider.raw_data_provider_ptr().get_device_calibration()
    src_calib = device_calib.get_camera_calib(sensor_name)
    # Target: a 1408x1408 linear (pinhole) camera with the same focal length
    dst_calib = calibration.get_linear_camera_calibration(1408, 1408, calibration_provider.get_focal_lengths()[0], sensor_name)
    # Rectify the image into the pinhole model
    rectified_image = calibration.distort_by_calibration(image, dst_calib, src_calib)
    return rectified_image
Then I use the undistorted RGB and depth data to generate a point cloud:
import cv2
import numpy as np
from plyfile import PlyData, PlyElement
from tqdm import tqdm

def rgbd_to_ply(rgb_path, depth_path, ply_path, fx, fy, cx, cy):
    rgb = undistort(cv2.imread(rgb_path))
    depth = undistort(cv2.imread(depth_path, cv2.IMREAD_UNCHANGED))
    h, w = depth.shape
    vertices = []
    for y in tqdm(range(h)):
        for x in range(w):
            # Pinhole back-projection; depth is assumed to be in millimeters
            Z = depth[y, x] / 1000.0
            X = (x - cx) * Z / fx
            Y = (y - cy) * Z / fy
            # OpenCV loads images as BGR, so swap to RGB for the PLY
            color = rgb[y, x]
            vertices.append((X, Y, Z, color[2], color[1], color[0]))
    vertex = np.array(vertices, dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4'),
                                       ('red', 'u1'), ('green', 'u1'), ('blue', 'u1')])
    el = PlyElement.describe(vertex, 'vertex')
    PlyData([el]).write(ply_path)
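(As an aside, the per-pixel Python loop above will be slow on 1408 x 1408 images; an equivalent vectorized version of the same pinhole back-projection could look like this, under the same assumptions of depth in millimeters and BGR input from OpenCV:)

import numpy as np

def backproject_pinhole(depth, rgb, fx, fy, cx, cy):
    # Vectorized pinhole back-projection for a full depth map
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    Z = depth.astype(np.float64) / 1000.0
    X = (xs - cx) * Z / fx
    Y = (ys - cy) * Z / fy
    points = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)[:, ::-1]  # BGR -> RGB
    valid = points[:, 2] > 0              # drop empty / zero-depth pixels
    return points[valid], colors[valid]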
And I suppose the intrinsic matrix is compatible with the undistorted RGBD. But unfortunately, the mismatch still occurs. BTW, a circular boundary (the valid pinhole radius) shows up in the rectified output, so I guess distort_by_calibration may be related to the unproject function.
Hi @choiszt, thank you for asking. What are your intrinsics specifically? Since you undistorted the image to a pinhole camera model with image width = 1408, height = 1408, and focal = calibration_provider.get_focal_lengths()[0], your fx, fy should equal calibration_provider.get_focal_lengths()[0] and cx, cy should equal (1408 - 1) / 2.
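In code, that amounts to something like the following (a sketch; calibration_provider is whatever object the focal length was read from in the undistort snippet, and the getters on dst_calib are only used if your version exposes them):

# Intrinsics implied by get_linear_camera_calibration(1408, 1408, focal, ...)
focal = calibration_provider.get_focal_lengths()[0]
fx = fy = focal
cx = cy = (1408 - 1) / 2.0  # principal point at the image center

# Alternatively, read them back directly from the linear (pinhole) calibration:
fx, fy = dst_calib.get_focal_lengths()
cx, cy = dst_calib.get_principal_point()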
The function distort_by_calibration just unprojects all the pixels to rays and reprojects them back with the dst_calib, which in your case is the pinhole camera model. This function is sped up in the C++ implementation, but you can also try the unproject function on src_calib without undistortion to debug where the issue is.
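A rough sketch of that debugging route, reusing the names from the snippets above (src_calib, depth_path; depth assumed to be in millimeters, and the None check assumes unproject reports invalid pixels, so adjust if the API differs):

import cv2
import numpy as np

# Skip undistortion entirely and back-project the ORIGINAL depth map with the
# fisheye calibration. If this cloud looks right, the problem is in the
# undistortion / pinhole-intrinsics step rather than in the extrinsics.
depth = cv2.imread(depth_path, cv2.IMREAD_UNCHANGED)  # original depth, not undistorted
h, w = depth.shape
points = []
for y in range(0, h, 8):              # subsample for speed
    for x in range(0, w, 8):
        d = depth[y, x] / 1000.0
        if d <= 0:
            continue
        ray = src_calib.unproject(np.array([x, y], dtype=np.float64))
        if ray is None:               # pixel outside the valid projection area
            continue
        points.append(ray / ray[2] * d)
points = np.array(points)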
Plus, if you could share your complete code, it might be easier to check where the error could be.
@choiszt I also created an example notebook that you can follow.
One thing to make sure (a mistake I made myself when creating this example) is not to use the get_visualizable() function on the depth images, as that changes the units.
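For example (following the accessors used elsewhere in this thread; to_numpy_array() is assumed to be the way to get the raw array in your version):

depth_with_dt = gt_provider.get_depth_image_by_timestamp_ns(timestamp, stream_id)

# Build the point cloud from the raw depth values (millimeters) ...
depth_raw = depth_with_dt.data().to_numpy_array()

# ... and avoid get_visualizable() here: it rescales the values for display, which changes the units.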
Hope this helps!
Hi @nickcharron! I also noticed the get_visualizable() issue when I tried to convert and save the data. I'll check out your examples, and I really appreciate your help!
Closing this issue, but please feel free to re-open if you have any more questions. Thanks!
Hi, quick related question here. From what I understand, undistorting a depth map with distort_by_calibration will only modify the location of the depth pixels, not the depth values themselves. That's why, after undistortion, we still cannot produce a point cloud from the rectified depth map using pinhole intrinsics. Is there a way to return the updated depth values during the undistortion process? Or produce them some other way?
Thank you very much!
Hi @VitorGuizilini-TRI,
Undistorting a depth image is similar to undistorting an RGB image. Using the undistorted image with the pinhole model will back-project the same pixel values to the same physical 3D coordinates as doing the same with the original image + original calibration. So yes, you should be able to produce a point cloud using the undistorted image with a pinhole model.
The undistortion process does require some slight editing of the pixel values because it needs to fill empty pixels. For those pixels, we provide the option to use a bilinear interpolation or pick the nearest neighbor for getting the depth of that pixel. This will introduce some error. The only way to completely remove that error would be for us to re-render the scene and create new depth maps using a pinhole camera model which would be a significant amount of work.
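As a concrete sanity check of that equivalence, back-projecting a pixel of the rectified depth map with the pinhole intrinsics should land on (approximately) the same 3D point as unprojecting the corresponding pixel of the original map with the fisheye calibration. A rough sketch, where rectified_depth, original_depth, fx, fy, cx, cy, src_calib, and dst_calib follow the earlier snippets, z-depth in millimeters is assumed, and validity checks are omitted:

import numpy as np

# Pinhole back-projection of one pixel of the rectified (undistorted) depth map
x, y = 700, 700                              # example pixel in the rectified image
Z = rectified_depth[y, x] / 1000.0
p_pinhole = np.array([(x - cx) * Z / fx, (y - cy) * Z / fy, Z])

# The same physical point, obtained from the original depth map + fisheye calibration
ray = dst_calib.unproject(np.array([x, y], dtype=np.float64))  # ray through the rectified pixel
src_pixel = src_calib.project(ray)                             # corresponding pixel in the original image
d = original_depth[int(src_pixel[1]), int(src_pixel[0])] / 1000.0
p_fisheye = ray / ray[2] * d

# Up to the interpolation error described above, p_pinhole should match p_fisheye.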
Hi team! Currently I'm reconstructing the 3D point cloud scene based on your brilliant guidance in adt_quickstart_tutorial.ipynb. However, I ran into some issues with the matrix conversions when implementing the point cloud conversion.
First, I assumed that the depth data are in the camera coordinate frame, based on
gt_provider.get_depth_image_by_timestamp_ns(timestamp, stream_id)
and then I acquired the intrinsic data in order to convert the RGBD data to a point cloud as follows:
Next, I want to align the point clouds from each frame in the same world coordinate frame, so I tried to acquire the extrinsic matrices:
and I supposed
transform_matrix = cam2dev @ dev2sce
is exactly the extrinsic matrix. Finally, I implemented the alignment in the world coordinate frame with:

def transform_ply_with_matrix(ply_path, transform_matrix, output_path):
    ply_data = PlyData.read(ply_path)
    vertex_data = np.vstack([ply_data['vertex']['x'],
                             ply_data['vertex']['y'],
                             ply_data['vertex']['z']]).T

but a mismatch appeared when I tried to visualize the two PLY files together in Meshlab.