city-super / MatrixCity


Noise in depth gt #4

Open BlueNine9 opened 1 year ago

BlueNine9 commented 1 year ago

Thank you for your excellent work!

I've downloaded the test set from the website and converted the depth image into a point cloud using the following code:

import os
import numpy as np

# OpenCV must see this flag before import to allow reading EXR files
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'

import cv2

# Depth is stored in the first channel of the EXR
image = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]

pixel_size = image.shape[0] * image.shape[1]

# Flatten the depth map and build per-pixel (u, v) coordinates
depth_image = image.reshape(pixel_size)
u_coord = np.tile(np.arange(image.shape[1]), (image.shape[0], 1)).reshape(pixel_size)
v_coord = np.tile(np.arange(image.shape[0]), (image.shape[1], 1)).T.reshape(pixel_size)

# Camera intrinsics
fx = 499.9999781443055
fy = 499.9999781443055
cx = image.shape[1] / 2
cy = image.shape[0] / 2

k = np.identity(3)
k[0, 2] = cx
k[1, 2] = cy
k[0, 0] = k[1, 1] = fx

# Back-project pixels to camera space: X_cam = depth * K^-1 @ [u, v, 1]^T
p2d = np.array([u_coord, v_coord, np.ones_like(u_coord)])

p3d = np.transpose(np.dot(np.linalg.inv(k), p2d) * depth_image).reshape(-1, 3)

# Permute axes: (x, y, z) -> (z, x, -y)
point_cloud = p3d[:, [2, 0, 1]].copy()
point_cloud[:, 2] = -point_cloud[:, 2]

from plyfile import PlyData, PlyElement

# Write the points as a binary PLY file
vertices = np.array([tuple(p) for p in point_cloud], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
plydata = PlyData([PlyElement.describe(vertices, 'vertex')], text=False)
plydata.write('point_cloud.ply')

The shape of the point cloud looks right; however, there is a noticeable amount of noise in it.

Is this normal, or am I converting the EXR file incorrectly? I'm opening 0000.exr from small_city_road_horizon_test_depth.tar.

[screenshots of the noisy point cloud]

yixuanli98 commented 1 year ago

Sorry for the late reply. Have you tried the depth from other views? Did you observe a similar phenomenon?

yixuanli98 commented 1 year ago

Thank you very much for your question! The way you read the depth is right. We have checked our depth map and found the cause. For the best rendering quality, we render with an anti-aliasing setting to reduce image noise, which has a smoothing effect on the image. Depth changes dramatically at object edges, and anti-aliasing smooths this change, which produces the noisy point cloud. A zoomed-in, pixel-level comparison of the depth map without and with anti-aliasing is shown below. [screenshot]

We think the depth maps rendered with anti-aliasing better align with the RGB images (which are also rendered with anti-aliasing) and can support most use cases, such as depth guidance for reconstruction. But for generating a point cloud from a depth map, anti-aliasing introduces a lot of noise, as you found. We have tried generating the point cloud from depth rendered without anti-aliasing, and that solves the problem. We will update a new depth map version without anti-aliasing soon. [screenshot]
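Until the new version is out, one possible workaround (just a sketch, not our official pipeline) is to drop pixels near depth discontinuities before back-projecting, since the blended depth values caused by anti-aliasing concentrate on object edges. The tolerance rel_thresh below is a hypothetical value to tune:

import numpy as np

def edge_mask(depth, rel_thresh=0.03):
    # Keep a pixel only if its depth differs from its 4-neighbors by less
    # than rel_thresh (relative), i.e. it does not sit on a depth edge.
    mask = np.ones_like(depth, dtype=bool)
    dx = np.abs(np.diff(depth, axis=1)) / depth[:, :-1]  # right-neighbor diff
    dy = np.abs(np.diff(depth, axis=0)) / depth[:-1, :]  # bottom-neighbor diff
    mask[:, :-1] &= dx < rel_thresh
    mask[:, 1:] &= dx < rel_thresh
    mask[:-1, :] &= dy < rel_thresh
    mask[1:, :] &= dy < rel_thresh
    return mask.reshape(-1)  # apply to the flattened points before projecting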

ChaoyiZh commented 12 months ago

Hi guys, can I ask why the depth values from the .exr files can be in the thousands? Is there any scaling applied? [screenshot]

yixuanli98 commented 12 months ago

The unit is cm.

yixuanli98 commented 12 months ago

The unit in our pose files (street/pose and aerial/pose) is 10000 cm. If you use the pose files to train a NeRF, you should divide the depth values by 10000.
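For example (a quick sketch; '0000.exr' stands for any depth file):

import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'  # must be set before importing cv2
import cv2

depth_cm = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
depth_m = depth_cm / 100.0             # raw EXR values are in centimeters
depth_pose_units = depth_cm / 10000.0  # one pose-file unit = 10000 cm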

ChaoyiZh commented 12 months ago

I am trying to fuse several RGB and depth pairs to generate the point clouds. Here is my core code; please feel free to point out any problems. Following Yixuan's instructions and the previous post, I divided the depth by 10000 and did the same coordinate conversion, but the results seem weird. [screenshot]

import os
import numpy as np

# OpenCV must see this flag before import to allow reading EXR files
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'

import cv2
import open3d as o3d

def depth_to_pc(depth_path, intrinsic, transform_matrix):
    # Depth lives in the first channel; raw unit is cm, one pose unit is 10000 cm
    depth_map = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
    pixel_size = depth_map.shape[0] * depth_map.shape[1]
    depth_image = depth_map.reshape(pixel_size) / 10000
    u_coord = np.tile(np.arange(depth_map.shape[1]), (depth_map.shape[0], 1)).reshape(pixel_size)
    v_coord = np.tile(np.arange(depth_map.shape[0]), (depth_map.shape[1], 1)).T.reshape(pixel_size)
    k = np.identity(3)
    k[0, 2] = intrinsic["cx"]
    k[1, 2] = intrinsic["cy"]
    k[0, 0] = intrinsic["fl_x"]
    k[1, 1] = intrinsic["fl_x"]
    # Back-project pixels to camera space
    p2d = np.array([u_coord, v_coord, np.ones_like(u_coord)])
    p3d = np.transpose(np.dot(np.linalg.inv(k), p2d) * depth_image).reshape(-1, 3)

    # Left coord to right
    point_cloud = p3d
    point_cloud[:, 1] = -point_cloud[:, 1]
    # Lift to homogeneous coordinates and apply the camera-to-world transform
    ones = np.ones((point_cloud.shape[0], 1))
    homogenous_coordinate = np.hstack((point_cloud, ones))
    transformed_points = (transform_matrix @ homogenous_coordinate.T).T

    return transformed_points[:, :3]

# data is the parsed transforms JSON; pose_path and camera_intrinsics are set up elsewhere
frames = data["frames"]
all_points, all_colors = [], []

for frame in frames[:10]:
    rel_img_path = frame["file_path"]
    transform_matrix = np.array(frame["transform_matrix"])

    abs_img_path = os.path.abspath(os.path.join(pose_path, rel_img_path)).replace("/pose", "")
    abs_depth_path = abs_img_path.replace("small_city", "small_city_depth", 1).replace("small_city_road_down", "small_city_road_down_depth").replace("png", "exr")
    if not os.path.exists(abs_img_path) or not os.path.exists(abs_depth_path):
        continue
    color_image = cv2.imread(abs_img_path)
    color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)
    point_cloud = depth_to_pc(abs_depth_path, camera_intrinsics, transform_matrix)

    all_points.append(point_cloud)
    all_colors.append(color_image.reshape(-1, 3) / 255.0)

merged_points = np.vstack(all_points)
merged_colors = np.vstack(all_colors)

final_pcd = o3d.geometry.PointCloud()
final_pcd.points = o3d.utility.Vector3dVector(merged_points)
final_pcd.colors = o3d.utility.Vector3dVector(merged_colors)
o3d.io.write_point_cloud("final_merged_point_cloud_2.pcd", final_pcd)

Goochaozheng commented 11 months ago

Same problem as @ChaoyiZh. Is there any solution?

yixuanli98 commented 11 months ago

> I am trying to fuse several RGB and depth pairs to generate the point clouds. […]

Have you successfully generated a reasonable point cloud from a single RGB-depth pair?

yixuanli98 commented 11 months ago

Besides, could you tell me which JSON file you use?

ChaoyiZh commented 11 months ago

> Besides, could you tell me which JSON file you use?

Yes. I fully understand that we are using the NeRF (OpenGL) convention, which is a right-handed coordinate system, and I can generate a point cloud from a single image-depth pair and its corresponding pose. Here is the file I am using: small_city/street/pose/block_A/transforms_train.json. The result I posted before was from fusing the first ten images.

yixuanli98 commented 11 months ago

Thx! I will try to fix this bug.

Snosixtyboo commented 11 months ago

> I am trying to fuse several RGB and depth pairs to generate the point clouds. […]

I think you wanna do

point_cloud[:, 2] = -point_cloud[:, 2]

For me, that makes the points align as expected.
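That is, combined with the y-flip already in the code above, you end up applying diag(1, -1, -1) to the camera-space points, converting the computer-vision camera convention (x right, y down, z forward) produced by the K⁻¹ back-projection into the OpenGL/NeRF convention (x right, y up, z backward) that transform_matrix expects. A one-line sketch of the combined flip (p3d as in the code above):

import numpy as np

# CV camera coords (y down, z forward) -> OpenGL/NeRF camera coords (y up, z backward)
point_cloud = p3d @ np.diag([1.0, -1.0, -1.0])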

yzslab commented 10 months ago

> I am trying to fuse several RGB and depth pairs to generate the point clouds. Here is my core code; please feel free to point out any problems. Following Yixuan's instructions and the previous post, I divided the depth by 10000 and did the same coordinate conversion, but the results seem weird.

Hi, have you solved this problem? I am trying to build a point cloud too, but I am facing the same problem as you.

yzslab commented 10 months ago

> I think you wanna do
>
> point_cloud[:, 2] = -point_cloud[:, 2]
>
> For me, that makes the points align as expected.

Hi, would you mind sharing more details about your implementation? I tried point_cloud[:, 2] = -point_cloud[:, 2] but it doesn't work.

yzslab commented 10 months ago

> I think you wanna do
>
> point_cloud[:, 2] = -point_cloud[:, 2]
>
> For me, that makes the points align as expected.

Thanks! With the operations below, the points align successfully:

point_cloud[:, 1] = -point_cloud[:, 1]
point_cloud[:, 2] = -point_cloud[:, 2]
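For reference, a minimal sketch of the full back-projection with both flips applied (depth_to_pc_fixed is just a name for this sketch; intrinsic holds the fl_x/cx/cy values from the transforms JSON, as in the code above):

import numpy as np
import cv2

def depth_to_pc_fixed(depth_path, intrinsic, transform_matrix):
    depth_map = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
    h, w = depth_map.shape
    depth = depth_map.reshape(-1) / 10000  # cm -> pose units
    u = np.tile(np.arange(w), h)           # column index per pixel
    v = np.repeat(np.arange(h), w)         # row index per pixel
    x = (u - intrinsic["cx"]) / intrinsic["fl_x"] * depth
    y = (v - intrinsic["cy"]) / intrinsic["fl_x"] * depth
    # Flip y and z: CV camera coords -> OpenGL/NeRF camera coords
    pts = np.stack([x, -y, -depth], axis=-1)
    pts_h = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=-1)
    return (np.asarray(transform_matrix) @ pts_h.T).T[:, :3]
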
guangkaixu commented 9 months ago

> We will update a new depth map version without anti-aliasing soon. […]

Hi, is there any update on the ground-truth depth maps? It seems the upload date of "small_city_depth" is still 3 months ago. Also, has the anti-aliasing problem been solved in "big_city_depth"?

yixuanli98 commented 9 months ago

> Hi, is there any update on the ground-truth depth maps? […]

Currently, we do not provide depth maps without anti-aliasing. If you want to use the depth maps to guide reconstruction, I think the anti-aliased version is more suitable because it aligns with the RGB images. If you want to use the depth maps to obtain geometry, you can refer to https://github.com/city-super/MatrixCity/blob/main/MatrixCityPlugin/docs/Render-Data.md#depth to render depth maps without anti-aliasing; we have already provided the camera sequences, render config, plugin, and detailed documentation.

yixuanli98 commented 6 months ago

We have updated the script for extracting point clouds from multiple RGB-depth pairs at https://github.com/city-super/MatrixCity/blob/main/scripts/rgbd2pc.py. @ChaoyiZh @yzslab @Goochaozheng @Snosixtyboo