some more clarifications about lines 54-60

baishali1986 commented 1 year ago

Hi

Thanks for sharing the code. I have some more questions about it.

1. Why did you choose s=120? Did you try other values? I ask because I want to initialize a voxel grid for my own data so that it encloses the object. Are there any tips or a procedure I should follow?
2. Why are x, y, z normalized by their max values, and why is the center point then subtracted? Can you explain the rationale?
3. Why are the points divided by 5 in line 59?
4. About line 60: how did you decide on the number 0.62? I am asking so that I can do the same with my images.
5. Another question, about the camera parameters: can you point to their source? Are the camera parameters the projection matrix, i.e. K*[R|t]?

Thanks a lot in advance

zinsmatt commented 1 year ago

Hi,

's' is the resolution of the voxel grid (s x s x s). A high value can produce a more detailed reconstruction but also requires more memory. You can start with a lower value and then increase it gradually.
The normalization and centering produce a grid whose coordinates lie between -0.5 and 0.5 along each direction. The division by 5 and the -0.62 are then used to fit the grid to the reconstructed object (the grid needs to enclose the object or the scene you want to reconstruct in 3D). I fine-tuned these values for this particular dataset by using a "test-and-repeat" strategy.
The camera parameters are loaded from the 'data/dino-Ps.mat' file. They are expressed by a 3x4 projection matrix P which is effectively equal to K*[R |t].
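
To make the steps above concrete, here is a minimal sketch of how such a grid could be built and projected with a 3x4 matrix. It is not the repository's code: the function names `make_grid` and `project`, the parameter names `scale` and `z_offset`, the choice of applying the offset along z, and the toy camera values are all assumptions made for illustration.

```python
import numpy as np

def make_grid(s=120, scale=5.0, z_offset=-0.62):
    """Return an (s**3, 3) array of voxel centers.

    scale and z_offset are dataset-specific tuning values (hypothetical names).
    Applying the offset along z is an assumption.
    """
    x, y, z = np.mgrid[:s, :s, :s]
    pts = np.stack((x.ravel(), y.ravel(), z.ravel()), axis=1).astype(float)
    pts /= pts.max(axis=0)   # each axis now spans [0, 1]
    pts -= pts.mean(axis=0)  # each axis now spans [-0.5, 0.5]
    pts /= scale             # shrink the cube to side length 1/scale
    pts[:, 2] += z_offset    # shift the cube along z
    return pts

def project(P, pts):
    """Project (N, 3) world points with a 3x4 matrix P; returns (N, 2) pixels."""
    pts_h = np.hstack((pts, np.ones((len(pts), 1))))  # homogeneous coordinates
    uvw = pts_h @ P.T
    return uvw[:, :2] / uvw[:, 2:3]

# Toy camera (assumed values, for a 640x480 image), with P = K [R | t].
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
P = K @ np.hstack((R, t.reshape(3, 1)))

grid = make_grid(s=20)
uv = project(P, grid)
```

One way to tune `scale` and the offset for a new dataset is to project the grid into a few images and check that the projected points cover the object in each of them.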

I hope it helps.

baishali1986 commented 1 year ago

Hi, thanks for the reply. I had some uncalibrated images and used COLMAP to get the camera parameters, i.e. K, R, T. But now I don't understand how to use this data to adjust the voxel grid for my dataset. Do you have any insight into how I should check whether the grid encloses the object or scene?

baishali1986 commented 1 year ago

By the way, do you happen to know the coordinate system in which the camera poses are expressed? For example, in COLMAP the x axis points right, the y axis points down and the z axis points forward. Also, I recorded the translation values of all the camera poses from the 33 images in my dataset. The limits are x: [-0.8478055162041223, 3.5431265043344955], y: [-4.508484764391479, 0.6116534483103162], z: [1.2852357676393724, 5.417266526229056]. How can I use these numbers to initialize a voxel grid?

zinsmatt commented 1 year ago

The camera poses should be expressed in a global coordinate system. The one chosen by COLMAP should be fine. Yes, you can use the camera positions to initialize the grid if the camera is orbiting around the object you want to reconstruct. You can start by defining a grid that fully encloses the camera positions. Alternatively, you could use the sparse point reconstruction created by COLMAP if you have access to it.
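
One caveat when doing this with COLMAP output: COLMAP stores world-to-camera poses (x_cam = R x_world + t), so the translation t is not the camera position. The camera center in world coordinates is C = -R^T t. A minimal sketch of computing the centers and their bounding box follows; the function names, the `margin` parameter and the two toy poses are hypothetical.

```python
import numpy as np

def camera_centers(Rs, ts):
    """Camera centers in world coordinates from world-to-camera poses.

    Rs: (N, 3, 3) rotations, ts: (N, 3) translations, with x_cam = R @ x_world + t.
    """
    Rs = np.asarray(Rs, dtype=float)
    ts = np.asarray(ts, dtype=float)
    # C = -R^T t, evaluated for every camera at once
    return -np.einsum("nji,nj->ni", Rs, ts)

def bounds_from_centers(centers, margin=0.0):
    """Axis-aligned box around the camera centers, enlarged by a relative margin."""
    lo = centers.min(axis=0)
    hi = centers.max(axis=0)
    pad = margin * (hi - lo)
    return lo - pad, hi + pad

# Two toy cameras facing each other along the z axis (assumed values).
Rs = [np.eye(3), np.diag([-1.0, 1.0, -1.0])]
ts = [[0.0, 0.0, 2.0], [0.0, 0.0, 2.0]]

centers = camera_centers(Rs, ts)
lo, hi = bounds_from_centers(centers, margin=0.1)
```

For an orbiting camera this box is a generous starting volume. It can then be shrunk step by step around the object.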

baishali1986 commented 1 year ago

How should I use the sparse point reconstruction for voxel grid initialization?

zinsmatt commented 1 year ago

The min/max coordinates of the sparse points along the x, y and z axes could be used to define the grid boundaries.
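
As a rough sketch of that idea, assuming the sparse points are already loaded as an (N, 3) array: the function name `grid_from_points`, the `margin` parameter and the random stand-in points below are hypothetical, not part of the repository.

```python
import numpy as np

def grid_from_points(points, s=120, margin=0.05):
    """Build an (s**3, 3) voxel grid spanning the bounding box of the points.

    margin enlarges the box by a fraction of its size along each axis.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    hi = points.max(axis=0)
    pad = margin * (hi - lo)
    lo, hi = lo - pad, hi + pad
    # s evenly spaced samples per axis, from lo to hi inclusive
    axes = [np.linspace(lo[i], hi[i], s) for i in range(3)]
    x, y, z = np.meshgrid(*axes, indexing="ij")
    grid = np.stack((x.ravel(), y.ravel(), z.ravel()), axis=1)
    return grid, lo, hi

# Stand-in for a sparse reconstruction (random values, for illustration only).
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(200, 3))

grid, lo, hi = grid_from_points(points, s=10, margin=0.05)
```

If the sparse cloud contains stray points far from the object, bounds taken from percentiles (for example with `np.percentile`) rather than the raw min/max may give a tighter box.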

zinsmatt / SpaceCarving

some more clarifications about lines 54-60 #9