isl-org / Open3D

Open3D: A Modern Library for 3D Data Processing
http://www.open3d.org

point cloud creation from interiornet dataset is skewed #2151

Open ttsesm opened 4 years ago

ttsesm commented 4 years ago

Hi, I am trying to create a point cloud from the Interiornet RGBD images, following the code snippet below:

    import open3d as o3d

    color = o3d.io.read_image('./rgb.png')
    depth = o3d.io.read_image('./depth.png')
    intrinsics = o3d.io.read_pinhole_camera_intrinsic("./camera_intrinsics.json")
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(color, depth, depth_trunc=4.0, convert_rgb_to_intensity=False)
    pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, o3d.camera.PinholeCameraIntrinsic(intrinsics))
    pcd.transform([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
    o3d.visualization.draw_geometries([pcd])

However, for some reason my point cloud is skewed: [screenshot of the skewed point cloud]

Any idea what could be wrong?

Environment:

I am attaching a pair of rgbd images and the intrinsics file. open3d_reconstruction_system.zip

ttsesm commented 4 years ago

From what I've noticed they also provide the extrinsic information as follows:

#timestamp [ns], position.x [m], position.y [m], position.z [m], quaternion.w [], quaternion.x [], quaternion.y [], quaternion.z []
0000000039991664640,0.0625990927,0.580219448,1.79874659,0.848812044,0.524193227,-0.0361736789,-0.0585750677

which I used to apply the transformation, but it still did not work:

    import numpy as np

    # pose: timestamp, position (x, y, z), quaternion (w, x, y, z)
    pose = np.array([39991664640, 0.0625990927, 0.580219448, 1.79874659, 0.848812044, 0.524193227, -0.0361736789, -0.0585750677])

    R = o3d.geometry.get_rotation_matrix_from_quaternion(pose[4:])
    Rt = np.vstack((np.hstack((R, pose[1:4].reshape(3, -1))), [0, 0, 0, 1]))

    pcd.transform(Rt)

theNded commented 4 years ago

They are using an unconventional Euclidean depth (Section 3.3) instead of projective depth. You will have to convert it yourself by modifying the depth data inside rgbd.
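The conversion being described can be sketched in plain NumPy. The function name and signature below are illustrative only, not part of the Open3D API; it assumes a pinhole model with focal lengths fx, fy and principal point (cx, cy), where the stored value is the ray length d = z * sqrt(1 + x² + y²):

    import numpy as np

    def ray_depth_to_z(depth, fx, fy, cx, cy):
        """Convert Euclidean (ray-length) depth to projective z-depth."""
        h, w = depth.shape
        x = (np.arange(w) - cx) / fx  # normalized x for each column
        y = (np.arange(h) - cy) / fy  # normalized y for each row
        # per-pixel ray-length normalization factor sqrt(1 + x^2 + y^2)
        norm = np.sqrt(1.0 + x[np.newaxis, :] ** 2 + y[:, np.newaxis] ** 2)
        return depth / norm

At the principal point the factor is 1, so the depth value is unchanged there; it shrinks values increasingly toward the image corners.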

ttsesm commented 4 years ago

@theNded thank you for the feedback. Can you provide a bit more info or an example of how to apply this conversion?

theNded commented 4 years ago

I tried to directly use the depth as Euclidean distance and create_from_nyu_format, but neither worked. I also checked https://interiornet.org/items/interiornet_format.pdf but didn't find any info regarding the depth format. Any idea how the depth is processed, e.g. is there a toolbox that consumes this depth?

ttsesm commented 4 years ago

@theNded nope, there is no information related to that in the .pdf file you are linking. I've also tried to contact the authors, but got no response.

In any case, I've just found this repository https://github.com/ethz-asl/interiornet_to_rosbag where in this line https://github.com/ethz-asl/interiornet_to_rosbag/blob/1f43e446f8b58525eed9f91346b3d0da8ab54b5c/nodes/interiornet_to_rosbag.py#L58 they seem to apply this Euclidean-to-projective transformation. I haven't tested it myself yet; once I find some time I will, and report back. Meanwhile, if you have some time available, you could test it yourself as well.

theNded commented 4 years ago

Thanks, in general you can use this snippet:

    import numpy as np
    import open3d as o3d

    depthu = o3d.io.read_image('depth.png')
    # convert to a float image so the per-pixel values can be modified
    depthf = o3d.geometry.Image(np.asarray(depthu).astype(np.float32))

    # process depthf using the code you've mentioned
    depthf_processed = process(depthf)

    rgbd = o3d.geometry.RGBDImage()
    rgbd.color = color
    rgbd.depth = depthf_processed

ttsesm commented 4 years ago

@theNded it seems that there is also an issue with the open3d image loader and loading depth values as Euclidean depth. For example the following code would still fail to give the correct point cloud:

    import open3d as o3d
    import numpy as np

    def euclidean_ray_length_to_z_coordinate(depth_image, camera_model):
        center_x = camera_model.get_principal_point()[0]
        center_y = camera_model.get_principal_point()[1]

        constant_x = 1 / camera_model.get_focal_length()[0]
        constant_y = 1 / camera_model.get_focal_length()[1]

        vs = np.array([(v - center_x) * constant_x for v in range(0, depth_image.shape[1])])
        us = np.array([(u - center_y) * constant_y for u in range(0, depth_image.shape[0])])

        return (np.sqrt(np.square(depth_image / 1000.0) / (1 + np.square(vs[np.newaxis, :]) + np.square(us[:, np.newaxis]))) * 1000.0).astype(np.uint16)

    if __name__ == '__main__':
        camera_intrinsics = o3d.camera.PinholeCameraIntrinsic()
        camera_intrinsics.set_intrinsics(640, 480, 600.0, 600.0, 320.0, 240.0)

        color = o3d.io.read_image('color.png')
        depthu = o3d.io.read_image('depth.png')
        depthf = o3d.geometry.Image(np.asarray(depthu).astype(np.float32))

        # process depthf using the euclidean_ray_length_to_z_coordinate function
        depthf_processed = o3d.geometry.Image(euclidean_ray_length_to_z_coordinate(np.asarray(depthf), camera_intrinsics))

        # create rgbd image
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(color, depthf_processed, depth_trunc=4.0, convert_rgb_to_intensity=False)

        # create pointcloud from rgbd image
        pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, camera_intrinsics)

        # visualize point cloud
        o3d.visualization.draw_geometries([pcd])

However, if I replace depthu = o3d.io.read_image('depth.png') with depthu = cv2.imread('depth.png', cv2.IMREAD_UNCHANGED), then I get the correct output. I do not know whether this should be considered a bug or a missing feature, but I just wanted to let you know.
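The OpenCV workaround can be checked with a self-contained round trip, since the original depth.png is not available here. The filename below is illustrative; the point is that cv2.IMREAD_UNCHANGED preserves 16-bit values that would be lost at 8 bits:

    import cv2
    import numpy as np

    # synthetic 16-bit depth image (values above 255 would be clipped at 8 bits)
    depth = (np.arange(480 * 640, dtype=np.uint32) % 5000).astype(np.uint16).reshape(480, 640)
    cv2.imwrite('depth16.png', depth)

    # IMREAD_UNCHANGED keeps the stored dtype instead of converting to 8-bit
    loaded = cv2.imread('depth16.png', cv2.IMREAD_UNCHANGED)

The loaded array comes back as uint16 and equal to the original, which is exactly the behavior the conversion function above depends on.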

theNded commented 4 years ago

@ttsesm Thanks for the feedback, this is very interesting. It is possible that this image uses a non-standard png format that Open3D does not recognize -- it succeeded in detecting the 16-bit format, but somehow read only 8-bit values.

Please keep using OpenCV for loading depth for now; I will let you know if there is any update on this :-)

ghost commented 3 years ago

> (quoting @ttsesm's earlier comment and the euclidean_ray_length_to_z_coordinate snippet in full)

In euclidean_ray_length_to_z_coordinate, the line (np.sqrt(np.square(depth_image / 1000.0) / (1 + np.square(vs[np.newaxis, :]) + np.square(us[:, np.newaxis]))) * 1000.0) raises ValueError: operands could not be broadcast together with shapes (768,576,3) (768,576). The error occurs because a 3-channel depth array is being divided by a 2-dimensional array. Are you still getting the correct output?
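The broadcast error above can be reproduced and worked around in a few lines; this is a sketch assuming the depth was loaded with three identical channels (e.g. as an RGB PNG), which is one plausible cause of the (768,576,3) shape:

    import numpy as np

    # a depth image accidentally loaded with 3 channels (H, W, 3) cannot be
    # divided by the (H, W) normalization grid: trailing dimensions 3 and 576
    # do not match, so NumPy raises a broadcast ValueError
    depth3 = np.full((768, 576, 3), 1500.0, dtype=np.float32)
    grid = np.ones((768, 576), dtype=np.float32)

    try:
        _ = depth3 / grid
    except ValueError as e:
        print('broadcast error:', e)

    # fix: keep a single channel before calling the conversion function
    # (valid only if all three channels carry the same depth value)
    depth = depth3[:, :, 0]
    result = depth / grid

Alternatively, loading with cv2.IMREAD_UNCHANGED on a true single-channel 16-bit PNG avoids the extra channel dimension entirely.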