Incorrect reconstruction result #2

Closed (jly0810 closed this issue 1 year ago)

jly0810 commented 1 year ago

Excuse me, below is the reconstruction result from my own dataset. The dataset consists of three groups of RGB-D images, but the reconstructed surfaces do not line up. What could be the reason? In your sample dataset, is the camera fixed while only the person rotates? Is that required?

[three screenshots of the reconstruction result]

The original RGB frames are: A_hat1_color_frame0, A_hat1_color_frame1, A_hat1_color_frame2

I hope you can solve my problem. It has troubled me for a long time. Thank you!!

Ritchizh commented 1 year ago

Hi @jly0810! In my experiment the camera was fixed and only the person was rotating. However, you can do the opposite: the person can sit still while the camera is moved around him/her.

1) The main requirement is that the change between adjacent frames should be small. Do you skip frames with the parameter at https://github.com/Ritchizh/RGBD-Integration-2020/blob/42cb66827580d2376ffcee5b60cd765e10ef2b5b/main__TSDF_Integrate__color_depth.py#L27 ? If yes, try decreasing it. In the frames above the change seems too large.

2) The main issue with your frames, as far as I can see, is that you skipped the subject segmentation step. You should delete the background behind your subject. I haven't published the code for segmentation. For every depth frame you should remove all pixels with a distance value larger than a threshold. Alternatively, for each point cloud you can delete points with a z coordinate larger than a threshold.
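The depth-threshold idea in point 2 can be sketched as follows. This is not the unpublished segmentation script, only a minimal illustration; the function name `remove_background`, the millimeter units, and the convention that 0 means "no measurement" are assumptions. Plain Python lists are used to keep the sketch dependency-free; with real frames a NumPy array would be the natural choice.

```python
def remove_background(depth, max_depth_mm):
    """Return a copy of `depth` with far pixels set to 0.

    `depth` is a 2D list of depth values in millimeters (assumed units).
    A value of 0 is treated as "no measurement" and is left as 0.
    """
    return [
        [d if 0 < d <= max_depth_mm else 0 for d in row]
        for row in depth
    ]


# Tiny made-up depth frame: a subject at about 0.9 m in front of a wall at about 3 m.
depth_frame = [
    [900, 950, 3000],
    [920, 0, 3100],
    [2800, 2900, 880],
]

# Keep only pixels closer than 1.5 m (hypothetical threshold).
foreground = remove_background(depth_frame, max_depth_mm=1500)
```

If the frames go through Open3D's `RGBDImage.create_from_color_and_depth`, its `depth_trunc` argument offers a similar cutoff, so a separate pass like this may not be needed for a simple distance threshold.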

Ritchizh commented 1 year ago

Have you tried anything yet?

1) You can try varying the distance truncation parameter here: https://github.com/Ritchizh/RGBD-Integration-2020/blob/42cb66827580d2376ffcee5b60cd765e10ef2b5b/main__TSDF_Integrate__color_depth.py#L96

2) I looked into the project; two years have passed and I can't remember which of the segmentation scripts I used 😅 So I'll upload one that seems to be right. It relies on a background .bag being recorded before the subject is captured.
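The uploaded script is not shown in this thread, so the following is only a guess at the general idea of segmenting against a pre-recorded background: keep a pixel when it is clearly closer to the camera than the same pixel in the background frame. The function name `subtract_background`, the tolerance value, and the millimeter units are all assumptions.

```python
def subtract_background(depth, background, tolerance_mm=50):
    """Zero out pixels that match a pre-recorded background depth frame.

    A pixel is kept when it has a valid depth (> 0) and is closer to the
    camera than the background by more than `tolerance_mm`, or when the
    background has no measurement (0) at that pixel.
    """
    result = []
    for row, bg_row in zip(depth, background):
        result.append([
            d if d > 0 and (bg == 0 or bg - d > tolerance_mm) else 0
            for d, bg in zip(row, bg_row)
        ])
    return result


# Made-up frames: an empty scene with a wall at 2 m, then a subject in front of it.
background_frame = [
    [2000, 2000, 2000],
    [2000, 0, 2000],
]
subject_frame = [
    [1000, 1990, 2000],
    [0, 900, 1940],
]

segmented = subtract_background(subject_frame, background_frame)
```

The tolerance absorbs sensor noise on the static background; a value that is too small leaves speckles of wall behind, and one that is too large eats into parts of the subject that stand close to the wall.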

jly0810 commented 1 year ago

Thank you for your reply.

  1. My dataset probably does not meet your requirement that the change between adjacent frames be small. Is there a specific criterion for this? Does this condition mainly affect the ICP step? In any case, I never skip frames: I always keep "skip_N_frames = 1".

  2. Does your code require background removal? Is it mandatory? My goal is not limited to reconstructing portraits, so background removal is not something I need. Judging from my reconstruction results, I don't think the missing background removal is the cause of the incorrect result; it is more likely caused by incorrect extrinsic parameters. I'm not sure my reasoning is correct, so please correct me if it isn't. I did not do the segmentation step, but I did change the truncation value, and the reconstruction result is still incorrect.

Thanks again for your answer, thank you!

Ritchizh commented 1 year ago

1) If you look into the ICP definition, you can see that it tries to find a rigid transformation (translation + rotation) that tightly matches 2 point clouds in space. So the closer your adjacent point clouds are, the easier it is to find this transform. You can try tuning the ICP function parameters to make the alignment converge. ICP tries to find matching point pairs in the two point clouds (based on various criteria: the closest point, the intersection of the source point's normal ray with the destination surface, etc.). This means that the more clearly defined the tracked objects are, the better. If you bring along the background wall plane, it will surely affect alignment. In the Open3D tutorial example they have a point cloud of a chair with a wall in the background. My guess is that it will work properly only if you move the camera but don't move the chair relative to the wall. If the chair is moved, it is not clear which objects to match: either align the walls in the 2 frames, or align the chairs.

2) However, the first step of the algorithm is a rough point cloud alignment with RANSAC: http://www.open3d.org/docs/release/tutorial/pipelines/global_registration.html?highlight=ransac Only after that is the more delicate ICP used. I would recommend taking 2 of your point clouds and running the RANSAC example from the link above on them, to see whether they can be aligned.

3) What sensor do you use to record data?
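To make point 1 concrete, here is a toy point-to-point ICP in 2D. It is a simplified stand-in, not the Open3D registration used in the project, and every name in it (`nearest`, `best_rigid_2d`, `icp_2d`, the L-shaped test cloud) is made up for illustration. Each iteration pairs every source point with its nearest target point and then solves for the least-squares rotation and translation of those pairs.

```python
import math


def nearest(p, cloud):
    """Closest point in `cloud` to `p` (brute-force search)."""
    return min(cloud, key=lambda q: math.dist(p, q))


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_cos = s_sin = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy          # centered source point
        bx, by = qx - dx, qy - dy          # centered target point
        s_cos += ax * bx + ay * by         # dot product term
        s_sin += ax * by - ay * bx         # cross product term
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def move(points, theta, tx, ty):
    """Rotate `points` by `theta` about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=20):
    """Iteratively align `source` to `target` using nearest-neighbor pairs."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = move(current, theta, tx, ty)
    return current


def mean_error(a, b):
    """Mean distance between points that truly correspond (same index)."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


# L-shaped target cloud with unit spacing.
target = [(float(x), 0.0) for x in range(6)] + [(0.0, float(y)) for y in range(1, 4)]

# Small change between "frames": 5 degrees plus a slight shift.
small = move(target, math.radians(5), 0.1, -0.1)
err_small = mean_error(icp_2d(small, target), target)

# Large change: 120 degrees. Nearest-neighbor pairs start out mostly wrong,
# so ICP can settle in a wrong local minimum.
large = move(target, math.radians(120), 0.0, 0.0)
err_large = mean_error(icp_2d(large, target), target)

print(err_small, err_large)
```

With the small offset every point's nearest neighbor is its true counterpart, so the pairing is correct from the first iteration. This is the reason for keeping adjacent frames close, and for running a coarse global alignment before ICP when they are not.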

jly0810 commented 1 year ago
1. In my dataset only the camera moves; the objects in the scene do not move relative to each other. The chair and the figure are treated as one object, so I don't think there is any ambiguity about which object to match.
2. I'll try this later.
3. The dataset above was rendered in Blender. In fact, I also have the camera extrinsic parameters. I tried them in the following project (https://github.com/andyzeng/tsdf-fusion-python), and the reconstruction results did not overlap. I studied this for a long time without finding the problem. I always suspected the camera extrinsics, so I found your code to try. In my previous work I used extrinsics obtained from calibration as input. I also have data captured with a RealSense, which gives correct results in the project above, but the results are incorrect here.

Ritchizh commented 1 year ago
  1. It is strange that your data captured by RealSense fails here (is it a RealSense D435?). Have you checked whether the intrinsics of your RealSense camera are the same as mine? https://github.com/Ritchizh/RGBD-Integration-2020/blob/84f4e4fe8d8fa6e5deec50c3d16df5ebaf4de707/main__TSDF_Integrate__color_depth.py#L103-L108
color_stream = profile.get_stream(realsense.stream.color);
color_video_stream = color_stream.as('video_stream_profile');
color_intrinsic = color_video_stream.get_intrinsic();
depth_aligned_to_color_intrinsic = color_intrinsic;  % depth aligned to color shares the color intrinsics

In this project it is assumed that you have aligned depth and color frames by means of pyrealsense when recording data (example), so extrinsic parameters are not needed. Only intrinsic parameters of the camera are used to create a point cloud.
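As a sketch of how intrinsics alone turn an aligned depth pixel into a 3D point, the standard pinhole back-projection looks like this. The intrinsic values below are made up for illustration and are not those of any particular RealSense device; the function name `backproject` and the millimeter depth scale are likewise assumptions.

```python
def backproject(u, v, depth_raw, fx, fy, cx, cy, depth_scale=1000.0):
    """Convert pixel (u, v) with raw depth into a 3D point in the camera frame.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    depth_scale: raw depth units per meter (1000.0 assumes millimeters).
    """
    z = depth_raw / depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


# Hypothetical intrinsics for a 640x480 image.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

center_point = backproject(320, 240, 1000, FX, FY, CX, CY)
side_point = backproject(620, 240, 2000, FX, FY, CX, CY)
```

Because `fx`, `fy`, `cx`, and `cy` scale and shift every point, intrinsics that do not match the recording device distort each per-frame point cloud before any registration is attempted.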

jly0810 commented 1 year ago
Hello, I have a question. In the call "volume.integrate(rgbd, cameraIntrinsics, camera_poses[num_cam_pose].pose)", what does the pose matrix, i.e. the camera extrinsic matrix (camera_poses[num_cam_pose].pose), represent? Is it camera to world (in other words: the coordinates in the camera coordinate system = pose * the coordinates in the world coordinate system) or world to camera (the coordinates in the world coordinate system = pose * the coordinates in the camera coordinate system)? I hope you can clear up my doubts, thanks.
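The two conventions in this question differ only by a matrix inverse, which the sketch below illustrates with plain 4x4 lists. It uses the common naming in which a camera-to-world pose maps camera coordinates to world coordinates (p_world = T * p_cam). For reference, the Open3D RGBD integration tutorial passes `np.linalg.inv(camera_poses[i].pose)` to `integrate`, which suggests the extrinsic argument is expected as world-to-camera while the logged poses are camera-to-world; whether that matches the poses produced in this repository is an assumption. The helper names `invert_rigid` and `transform` are made up.

```python
def invert_rigid(T):
    """Invert a 4x4 rigid transform [[R, t], [0, 1]] as [[R^T, -R^T t], [0, 1]]."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform(T, p):
    """Apply a 4x4 rigid transform to a 3D point."""
    return tuple(sum(T[i][k] * p[k] for k in range(3)) + T[i][3] for i in range(3))


# Hypothetical camera-to-world pose: camera at world (1, 0, 0),
# rotated 90 degrees about the Z axis.
cam_to_world = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
world_to_cam = invert_rigid(cam_to_world)

p_cam = (0.0, 0.0, 2.0)                       # 2 m in front of the camera
p_world = transform(cam_to_world, p_cam)      # same point in world coordinates
p_back = transform(world_to_cam, p_world)     # back in camera coordinates
```

A quick practical check is to integrate two frames once with the poses as given and once with their inverses: with the wrong convention, surfaces from different frames typically drift apart instead of overlapping.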

Ritchizh commented 1 year ago

Hi @jly0810 ! I'm looking into this project again, so I wondered: have you succeeded with adjusting the code to your data? If not, I could try to run it for you, if you share several (3-5) color+depth frames and camera intrinsics.

ethyd4 commented 1 year ago

Hello ma'am, I captured data to reconstruct a model. First, I kept the sensor fixed and moved the object; that gave decent results. But when I did the reverse (object fixed, sensor moving), the results were not accurate: the final output contained multiple instances of the object. Can you suggest a better way to scan while the object stays fixed?

Using this registration method, can we reconstruct bigger objects such as a car, a bike, or other large objects?

Ritchizh commented 1 year ago

@ethyd4 Hello! You should create a new issue; in this thread it's off-topic. In the new issue, please explain what "producing multiple instances of object" means: at what stage of the algorithm it happens and what it looks like.