Hellsice opened this issue 8 months ago
In this example, I define the voxel grid to enclose the object because I know its approximate location and size. If that's not your case, you can use a coarse-to-fine approach: start with a large grid (with large voxels) to get a first idea of the object's location and size, then use a smaller grid with smaller voxels around the object to get a finer reconstruction.
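The coarse-to-fine idea can be sketched like this (a minimal sketch: `make_grid`, the grid sizes, and the refinement step are illustrative, not the repository's actual API):

```python
import numpy as np

def make_grid(center, half_size, n):
    # n voxel centers per axis, spanning [center - half_size, center + half_size]
    axes = [np.linspace(c - half_size, c + half_size, n) for c in center]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    return np.stack([X, Y, Z], axis=-1).reshape(-1, 3)

# Coarse pass: a big grid with few (i.e. large) voxels.
coarse = make_grid(center=(0.0, 0.0, 0.0), half_size=1.0, n=20)

# After carving, keep the voxels that survived and recenter a finer grid on them:
# occupied = coarse[mask]
# fine = make_grid(occupied.mean(axis=0),
#                  np.ptp(occupied, axis=0).max() / 2, n=100)
```

The second pass spends the same voxel budget on a much smaller region, so the reconstruction gets finer without the memory cost of a dense global grid.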
uvs are the coordinates of the 3D points once projected into the image. The division is done because I'm using perspective projection (the classical pinhole camera model). If the coordinates fall outside the image, it can mean that:
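The projection with its perspective division can be sketched as follows (`project_points` is a hypothetical helper, not code from the repository):

```python
import numpy as np

def project_points(points_3d, P):
    """Project Nx3 world points into the image with a 3x4 projection matrix P."""
    # Homogeneous coordinates: append a column of ones.
    pts_h = np.hstack([points_3d, np.ones((points_3d.shape[0], 1))])
    proj = pts_h @ P.T                 # Nx3 homogeneous image points
    uvs = proj[:, :2] / proj[:, 2:3]   # perspective division by depth
    return uvs
```

After the division, a point is visible only if `0 <= u < width` and `0 <= v < height` (and the depth `proj[:, 2]` is positive).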
The projection matrix combines both the intrinsic and extrinsic camera parameters. The intrinsic parameters can indeed be obtained using the chessboard calibration technique. Then, you also need the extrinsic parameters, i.e. the position and orientation of the camera in the world. For that, you can either:
2/3. I'm using a SIFT detector with a FLANN-based matcher to find similar points between images and calculate the essential matrix from those, which seems to be what you meant with the second point. I also compared your projection matrices with mine, and noticed that your first projection matrix isn't equivalent to the camera matrix with a column of zeros appended, which makes me think that I may have made a mistake with the camera poses. Though the first image shouldn't have a rotation and translation, right? As it is the image that defines the coordinate system?
2/3. You can extract a relative pose from the essential matrix only up to scale. That means you can obtain a relative pose for each image pair, but you won't be able to merge them directly, as they all have their own arbitrary scale. I would suggest you use an SfM tool like Colmap to compute your image poses.
Note: You cannot directly compare the projection matrices, as the camera poses are expressed in some reference coordinate system. In this example, the reference coordinate system is NOT the first camera pose; that is why its pose is not [I 0].
Thanks for the tip. I tried using Colmap, though I noticed that exporting its results does not always work: my volume calibration images come out fine, but the other image sets give incorrect poses. I also tried your images with Colmap for the camera calibration and pose estimation, but I got no results from that either.
And as I'm trying to find the volume through a point cloud, I also tried using the point clouds from Colmap, though those contain a few noise points that make the volume estimate hugely incorrect unless I can filter them out. So I'm sticking with what you made, though I can't get it to work at all. I uploaded what I have to my own repository; would you be willing to take a look?
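For the noise points, a crude statistical filter may already help before estimating volume; `filter_outliers` below is a hypothetical helper, and Open3D's `remove_statistical_outlier` is a more robust off-the-shelf option:

```python
import numpy as np

def filter_outliers(points, k_std=2.0):
    """Drop points whose distance to the centroid exceeds the mean
    distance by more than k_std standard deviations."""
    d = np.linalg.norm(points - points.mean(axis=0), axis=1)
    keep = d < d.mean() + k_std * d.std()
    return points[keep]
```

This assumes the noise is a small number of far-away points; for noise mixed into the object itself, a neighbor-count-based filter would be needed instead.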
Thanks for sharing the code. I'm currently trying to apply it to an aquaponics system to determine the volume of plants, but I'm stuck on the actual space carving part.