BlueNine9 opened this issue 1 year ago
Sorry for the late reply. Have you tried depth from other views? Did you observe a similar phenomenon?
Thank you very much for your question! I think the way you read the depth is right. We have checked our depth maps and found the reason. For the best rendering quality, we use the anti-alias setting to reduce noise, which has a smoothing effect on the image. Depth changes dramatically at object edges, and anti-aliasing smooths this change, which causes the noisy point cloud. The depth maps without and with anti-aliasing are shown below in a zoomed-in, pixel-level comparison.
We think the depth maps rendered with anti-aliasing align better with the RGB images, which are also rendered with anti-aliasing, and can support most requirements, such as depth guidance for reconstruction. But for generating a point cloud from a depth map, anti-aliasing introduces a lot of noise, as you found. We have tried generating the point cloud from depth rendered without anti-aliasing, and that solved the problem. We will release a new version of the depth maps without anti-aliasing soon.
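To make the effect concrete, here is a toy numeric example (a simple 50/50 average is assumed here, not our actual renderer):
# Toy example: anti-aliasing blends the foreground and background depths
# at an object edge, so the back-projected point lies on neither surface.
foreground_depth = 5.0    # e.g. a building edge, 5 m away
background_depth = 50.0   # the street behind it, 50 m away

# A 50/50 anti-aliased edge pixel averages the two sub-samples:
aa_depth = 0.5 * foreground_depth + 0.5 * background_depth
print(aa_depth)  # 27.5 m -- a stray point floating in mid-air, which
                 # shows up as noise along object boundaries in the fused cloud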
Hi guys, can I ask why the depth values from the .exr files can be in the thousands? Is there any scaling applied to them?
The unit is cm.
The unit in our pose files (street/pose and aerial/pose) is 10000 cm. If you use the pose files to train a NeRF, you should divide the depth values by 10000 so they match the pose units.
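For example, a minimal sketch of the scaling (the file path here is hypothetical):
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'  # must be set before importing cv2
import cv2

# Raw EXR depth is in cm; the pose files use units of 10000 cm.
depth_cm = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
depth_pose_units = depth_cm / 10000.0  # consistent with the transforms_*.json poses
depth_m = depth_cm / 100.0             # or metres, if you prefer physical units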
I am trying to fuse several RGB and depth pairs to generate point clouds. Here is my core code; please feel free to point out any problems. Following the instructions from YiXuan and the previous post, I divided the depth by 10000 and did the same coordinate conversion, but the results seem weird.
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'  # must be set before importing cv2
import cv2
import numpy as np
import open3d as o3d

def depth_to_pc(depth_path, intrinsic, transform_matrix):
    # Load the EXR depth map and scale from cm to pose units (1 unit = 10000 cm).
    depth_map = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
    pixel_size = depth_map.shape[0] * depth_map.shape[1]
    depth_image = depth_map.reshape(pixel_size) / 10000
    # Pixel coordinate grids, flattened to match the depth vector.
    u_coord = np.tile(np.arange(depth_map.shape[1]), (depth_map.shape[0], 1)).reshape(pixel_size)
    v_coord = np.tile(np.arange(depth_map.shape[0]), (depth_map.shape[1], 1)).T.reshape(pixel_size)
    # Intrinsic matrix (square pixels assumed, so fl_x is used for both axes).
    k = np.identity(3)
    k[0, 2] = intrinsic["cx"]
    k[1, 2] = intrinsic["cy"]
    k[0, 0] = intrinsic["fl_x"]
    k[1, 1] = intrinsic["fl_x"]
    # Back-project pixels into camera space.
    p2d = np.array([u_coord, v_coord, np.ones_like(u_coord)])
    p3d = np.transpose(np.dot(np.linalg.inv(k), p2d) * depth_image).reshape(-1, 3)
    # Left coord to right
    point_cloud = p3d
    point_cloud[:, 1] = -point_cloud[:, 1]
    # Transform camera-space points to world space with the frame pose.
    ones = np.ones((point_cloud.shape[0], 1))
    homogenous_coordinate = np.hstack((point_cloud, ones))
    transformed_points = (transform_matrix @ homogenous_coordinate.T).T
    return transformed_points[:, :3]

# data, pose_path and camera_intrinsics are loaded from transforms_train.json beforehand.
all_points, all_colors = [], []
frames = data["frames"]
for frame in frames[:10]:
    rel_img_path = frame["file_path"]
    transform_matrix = np.array(frame["transform_matrix"])
    abs_img_path = os.path.abspath(os.path.join(pose_path, rel_img_path)).replace("/pose", "")
    abs_depth_path = abs_img_path.replace("small_city", "small_city_depth", 1).replace("small_city_road_down", "small_city_road_down_depth").replace("png", "exr")
    if not os.path.exists(abs_img_path) or not os.path.exists(abs_depth_path):
        continue
    color_image = cv2.imread(abs_img_path)
    color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)
    point_cloud = depth_to_pc(abs_depth_path, camera_intrinsics, transform_matrix)
    all_points.append(point_cloud)
    colors = color_image.reshape(-1, 3) / 255.0
    all_colors.append(colors)

merged_points = np.vstack(all_points)
merged_colors = np.vstack(all_colors)
final_pcd = o3d.geometry.PointCloud()
final_pcd.points = o3d.utility.Vector3dVector(merged_points)
final_pcd.colors = o3d.utility.Vector3dVector(merged_colors)
o3d.io.write_point_cloud("final_merged_point_cloud_2.pcd", final_pcd)
Same problem as @ChaoyiZh. Is there any solution?
I am trying to fuse several RGB and depth pairs to generate point clouds. [...]
Have you successfully generated a reasonable point cloud from a single RGB-depth pair?
Besides, could you tell me which json file you use?
Yes, I totally understand that we are using the NeRF (OpenGL) convention, which is a right-handed coordinate system. And I can generate one point cloud from a single image-depth pair and the corresponding pose. Here is the file I am using: small_city/street/pose/block_A/transforms_train.json. The result I posted before was from fusing the first ten images.
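For reference, a minimal sketch of loading that file (assuming the standard transforms-json layout with intrinsics at the top level):
import json
import os

pose_path = "small_city/street/pose/block_A"
with open(os.path.join(pose_path, "transforms_train.json")) as f:
    data = json.load(f)

# Top-level intrinsics, as referenced by the snippet above.
camera_intrinsics = {key: data[key] for key in ("fl_x", "cx", "cy")}
frames = data["frames"]  # each entry has file_path and transform_matrix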
Thx! I will try to fix this bug.
I am trying to fuse several RGB and depth pairs to generate point clouds. [...]
I think you wanna do
point_cloud[:, 2] = -point_cloud[:, 2]
For me, that makes the points align as expected
I am trying to fuse several RGB and depth pairs to generate point clouds. [...]
Hi, have you solved this problem? I am trying to build a point cloud too, but I face the same problem as you.
I think you wanna do point_cloud[:, 2] = -point_cloud[:, 2]; for me, that makes the points align as expected.
Hi, would you mind sharing more details about your implementation? I tried point_cloud[:, 2] = -point_cloud[:, 2], but it doesn't work.
I think you wanna do point_cloud[:, 2] = -point_cloud[:, 2]; for me, that makes the points align as expected.
Thanks, with the operations below, the points align successfully:
point_cloud[:, 1] = -point_cloud[:, 1]
point_cloud[:, 2] = -point_cloud[:, 2]
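As I understand it (a minimal sketch, not an official statement), negating y and z converts the OpenCV-style camera coordinates from the pinhole back-projection into the OpenGL/NeRF convention that the pose matrices expect, and both flips can be folded into a single axis-flip matrix:
import numpy as np

# OpenCV camera coords (x right, y down, z forward) -> OpenGL/NeRF
# coords (x right, y up, z backward), i.e. negate the y and z axes.
OPENCV_TO_OPENGL = np.diag([1.0, -1.0, -1.0])

def cam_to_world(p3d_cv, transform_matrix):
    # Equivalent to negating columns 1 and 2 of p3d_cv.
    p3d_gl = p3d_cv @ OPENCV_TO_OPENGL.T
    homog = np.hstack([p3d_gl, np.ones((p3d_gl.shape[0], 1))])
    return (transform_matrix @ homog.T).T[:, :3]  # apply camera-to-world pose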
[...] We will release a new version of the depth maps without anti-aliasing soon.
Hi, is there any update on the ground-truth depth maps? It seems the update date of "small_city_depth" is still 3 months ago. Besides, has the anti-aliasing problem been solved in "big_city_depth"?
Hi, is there any update on the ground-truth depth maps? [...]
Currently, we do not provide depth maps without anti-aliasing. If you want to use the depth maps to guide reconstruction, I think the anti-aliased version is more suitable because it aligns with the RGB images. If you want to use the depth maps to obtain geometry, you can refer to https://github.com/city-super/MatrixCity/blob/main/MatrixCityPlugin/docs/Render-Data.md#depth to render depth maps without anti-aliasing; we have already provided the camera sequences, render config, plugin, and detailed documentation.
We have updated the script for extracting point clouds from multiple RGB-depth pairs at https://github.com/city-super/MatrixCity/blob/main/scripts/rgbd2pc.py @ChaoyiZh @yzslab @Goochaozheng @Snosixtyboo
Thank you for your excellent work!
I've downloaded the test set from the website and converted the depth image into a point cloud using code like the following (a sketch with placeholder intrinsics; the depth is divided by 100 to get metres):
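# Representative sketch (not the exact original snippet); placeholder intrinsics.
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'  # set before importing cv2
import cv2
import numpy as np
import open3d as o3d

# Placeholder intrinsics -- substitute the values from the matching pose json.
fx = fy = 1000.0
cx, cy = 500.0, 500.0

depth = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
h, w = depth.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))

# Pinhole back-projection; raw depth is in cm, so divide by 100 for metres.
z = depth / 100.0
x = (u - cx) * z / fx
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
o3d.io.write_point_cloud('single_view.pcd', pcd)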
The shape of the point cloud seemed right; however, there was a noticeable amount of noise in it. Is this normal, or am I converting the exr file the wrong way? I'm opening 0000.exr from small_city_road_horizon_test_depth.tar.