NVlabs / BundleSDF

[CVPR 2023] BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
https://bundlesdf.github.io/

Test BundleSDF with Synthetic dataset - discussion #140

Closed: fedona closed this issue 6 months ago

fedona commented 7 months ago

Hello, thank you for providing such an interesting repository, I have been testing a lot of videos and I am very fascinated by the pipeline :)

After having run the code on several real image sequences, I have been trying to test BundleSDF on some custom synthetic data. So far I have obtained terrible results, probably because of a wrong camera calibration file (I am struggling to derive it correctly from the information Blender's camera provides).

[Screenshot from 2024-01-28 17-55-09] Here you can see the first reconstruction of a chair, obtained after ~30 frames.

[Screenshot from 2024-01-28 17-57-11] Here is the final reconstruction, after ~300 frames.

Do you have any suggestions for obtaining better results (or for getting the calibration matrix out of Blender)? May I have your insights on testing BundleSDF on synthetic datasets?
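For reference, this is roughly how I am computing cam_K.txt from Blender at the moment (a minimal sketch, not necessarily correct; it assumes square pixels, zero lens shift, and a horizontal sensor fit on a landscape render, and any of those assumptions may be where it goes wrong):

```python
import bpy

# Minimal sketch: derive a 3x3 pinhole intrinsic matrix K from Blender's
# active perspective camera. Assumes square pixels, zero lens shift, and
# sensor_fit 'HORIZONTAL'/'AUTO' on a landscape render.
scene = bpy.context.scene
cam = scene.camera.data

scale = scene.render.resolution_percentage / 100.0
width_px = scene.render.resolution_x * scale
height_px = scene.render.resolution_y * scale

# Blender stores the focal length and sensor width in millimeters.
fx = cam.lens * width_px / cam.sensor_width
fy = fx                    # square pixels assumed
cx = width_px / 2.0        # principal point at the image center
cy = height_px / 2.0

with open("cam_K.txt", "w") as f:
    f.write(f"{fx} 0 {cx}\n0 {fy} {cy}\n0 0 1\n")
```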

Also, wouldn't it be interesting to see how BundleSDF performs on the BOP challenge? Some datasets there have non-consecutive images (like Linemod), which I believe is not ideal for this pipeline, but it might still be nice to see, either officially or unofficially.

Thank you so much

kevinDrawn commented 7 months ago

Hello! Are you running experiments on data you captured yourself? That's impressive.

I'd like to try 3D reconstruction with my own data as well and was wondering if I could seek some advice.

The BundleSDF website mentioned that depth, color, and mask images in PNG format, along with the camera intrinsic matrix, should suffice. So, I prepared those and attempted to run it, but encountered an error.

I'm starting to think that perhaps point cloud data is necessary: in the milk example there is a point cloud file named 13985298.ply. How did you prepare your input data?

fedona commented 7 months ago

Hey there!

To my knowledge there is no need for any point cloud file. I would suggest you check carefully that all folders are named exactly as in the milk example, that the RGB files have the same names as their respective mask and depth images, and that you provide a copy of the first frame's mask named "mask.png". What is really necessary is: /rgb, /depth, /masks, cam_K.txt, and mask.png, as sketched below.
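For reference, the expected layout looks like this (the frame names are illustrative; in the milk example they are timestamps, and what matters is that each RGB frame has identically named depth and mask files):

```
your_video/
├── cam_K.txt        # 3x3 camera intrinsic matrix
├── mask.png         # copy of the first frame's mask
├── rgb/
│   ├── 000000.png
│   └── ...
├── depth/
│   ├── 000000.png
│   └── ...
└── masks/
    ├── 000000.png
    └── ...
```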

I am recording my own dataset with an Azure Kinect camera, and I transform the depth images to match the RGB images' calibration. I get the masks with MiVOS, because I had problems running the suggested method on my machine.
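The capture and depth-to-color registration can be done along these lines (a minimal sketch assuming the pyk4a Python bindings for the Azure Kinect; the output file names are illustrative, and my actual pipeline differs in the details):

```python
import cv2
from pyk4a import PyK4A

k4a = PyK4A()
k4a.start()
capture = k4a.get_capture()
if capture.color is not None and capture.transformed_depth is not None:
    # transformed_depth is uint16 depth in mm, registered to the color camera.
    cv2.imwrite("rgb/000000.png", capture.color[:, :, :3])      # drop alpha
    cv2.imwrite("depth/000000.png", capture.transformed_depth)  # 16-bit PNG
k4a.stop()
```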

If you still cannot find a solution, I suggest you open a separate issue with your error message. Have fun!

kevinDrawn commented 7 months ago

Oh!! Thank you for your help! I hadn't placed mask.png in the directory you mentioned. I will do that.

In my case, I got the mask images using the FastSAM algorithm. Thank you!

wenbowen123 commented 7 months ago

BundleSDF should work pretty well on synthetic data. For debugging, start by increasing the debug level to get more verbose logging. You can then check whether the feature matching makes sense.

fedona commented 5 months ago

Hello, I have been testing BundleSDF on BlenderProc-generated images and I still encounter problems.

Since I have been using a standard script for BOP challenge training data generation, I believe the camera calibration matrix must be correct. I modified the above script to generate contiguous images rather than scattered ones, roughly as sketched below.
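Concretely, the kind of change I mean looks like this (a minimal sketch with the bproc API; the radius, height, and frame count are illustrative, and the scene loading is elided):

```python
import numpy as np
import blenderproc as bproc

bproc.init()
# ... load the BOP object, materials, and lights as in the original script ...

# Instead of sampling scattered random viewpoints, walk the camera along a
# smooth circle so consecutive frames overlap like a real video.
n_frames, radius, height = 300, 1.5, 1.0   # illustrative values
for i in range(n_frames):
    angle = 2 * np.pi * i / n_frames
    location = np.array([radius * np.cos(angle), radius * np.sin(angle), height])
    # Point the camera at the origin, where the object sits.
    rotation = bproc.camera.rotation_from_forward_vec(-location)
    cam2world = bproc.math.build_transformation_mat(location, rotation)
    bproc.camera.add_camera_pose(cam2world)

data = bproc.renderer.render()
```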

To make it run I have to set zfar = 7 (or more), even though the object is much closer in the captures. In the GUI the pose axes are very small, collapsing to a point, and when it is time to generate the first 3D reconstruction mesh the run hangs in a "limbo", printing "[bundlesdf.py] Getting mesh". Increasing the debug level has not been helpful.

[Screenshot from 2024-03-12 10-52-22]

When I change the camera calibration matrix by increasing the focal length values, I can see the pose axes get bigger, but then unfortunately my GPU runs out of memory while generating the 3D mesh.

Is there an example synthetic dataset available that I could test BundleSDF on? Something like the real milk dataset...

wenbowen123 commented 5 months ago

You shouldn't change the camera intrinsics, as that will hurt the performance. If you want to increase the size of the visualization axes, change the "scale" and "thickness" parameters here.
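For illustration, those two parameters behave analogously to this OpenCV-style sketch (not BundleSDF's actual code; cv2.drawFrameAxes is just a stand-in, and the defaults are illustrative):

```python
import cv2
import numpy as np

def draw_pose_axes(img, K, rvec, tvec, scale=0.1, thickness=3):
    """Draw the XYZ axes of a pose; 'scale' is the axis length in meters."""
    dist_coeffs = np.zeros(5)  # assuming undistorted (synthetic) images
    cv2.drawFrameAxes(img, K.astype(np.float64), dist_coeffs,
                      rvec, tvec, scale, thickness)
    return img
```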

Also, what size is the dinosaur? It seems very big. Some parameters in the current implementation assume the size of a common daily-life object. If your simulation asset is overly large, consider rescaling it, e.g. along the lines below.
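A minimal rescaling sketch with trimesh (the file names are hypothetical, and the ~0.2 m target is just an illustrative daily-object scale):

```python
import trimesh

# Rescale the asset so its longest bounding-box side is ~0.2 m.
mesh = trimesh.load("dinosaur.obj")          # hypothetical asset path
longest_side = max(mesh.bounding_box.extents)
mesh.apply_scale(0.2 / longest_side)
mesh.export("dinosaur_rescaled.obj")
```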