@raphaelsulzer The provided extrinsics define the mapping from world to camera coordinates (following the standard Computer Vision convention).
The depth maps don't have actual units; they are defined by the scale of the reconstruction. If, for example, the model is scaled such that one unit corresponds to one meter, then the depth values are in meters.
The best description for converting depth map entries to 3D points can probably be found here: https://github.com/colmap/colmap/blob/d3a29e203ab69e91eda938d6e56e1c7339d62a99/src/mvs/fusion.cc#L216
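In Python, the logic of that function boils down to roughly the following sketch (my variable names; I assume a simple pinhole camera with intrinsic matrix `K`, and `qvec2rotmat` is the helper from COLMAP's `scripts/python/read_write_model.py`):

```python
import numpy as np
from read_write_model import qvec2rotmat  # scripts/python/read_write_model.py

def depth_to_world(depth_map, K, qvec, tvec):
    """Back-project all valid depth map entries to 3D world points.

    depth_map: HxW array as read from the .bin file
    K:         3x3 intrinsic matrix of the (possibly resized) image
    qvec/tvec: extrinsics from images.txt (world -> camera)
    """
    R = qvec2rotmat(qvec)                      # world -> camera rotation
    h, w = depth_map.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    depth = depth_map.reshape(-1)
    valid = depth > 0                          # invalid pixels carry depth <= 0
    # pixel -> camera space: x_cam = depth * K^-1 [u, v, 1]^T
    x_cam = np.linalg.inv(K) @ pix[:, valid] * depth[valid]
    # camera -> world space: x_world = R^T (x_cam - t)
    return (R.T @ (x_cam - tvec.reshape(3, 1))).T
```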
@tsattler Thank you! It was fairly easy that way.
@tsattler Thanks for your answer. Does the convention you mentioned here refer to the right-hand rule, where the axes of the world coordinate system follow [X, Y, Z] -> [right, up, backwards]?
"The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image." (see https://colmap.github.io/format.html#images-txt)
@tsattler Thanks. I had read the documentation, but the camera extrinsics part was still not clear to me. Does the camera coordinate system look like the following figure, where the gaze direction of the camera is the negative z-axis?
Does the world coordinate system also follow "the X axis points to the right, the Y axis to the bottom, and the Z axis to the front"?
The coordinate system is the one commonly used in the computer vision literature, where the camera looks down the positive z-axis.
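In other words, given the extrinsics of an image, the camera center and viewing direction in world coordinates follow as in this sketch (again using the helpers from `scripts/python/read_write_model.py`; `image` is any entry from the `read_model` output):

```python
import numpy as np
from read_write_model import read_model, qvec2rotmat  # scripts/python

cameras, images, points3D = read_model("pathToModel/txt/", ".txt")
image = next(iter(images.values()))          # any registered image

R = qvec2rotmat(image.qvec)                  # world -> camera rotation
t = np.array(image.tvec)
C = -R.T @ t                                 # camera center in world coordinates
view_dir = R.T @ np.array([0.0, 0.0, 1.0])   # camera +z axis in world coordinates
```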
@raphaelsulzer Hi, I also ran into this problem; have you solved it? I used the functions from read_and_write_dense.py to get the depth map and the interior parameters to convert it to camera coordinates, but something seems to be wrong. Could you please tell me how you use the depth_map.bin? Thanks a lot.
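For reference, what I do to read the depth map is roughly this sketch (`read_array` is the helper from COLMAP's `scripts/python/read_write_dense.py`; the file name is just an example):

```python
from read_write_dense import read_array  # scripts/python/read_write_dense.py

# example path; COLMAP stores depth maps under dense/stereo/depth_maps/
depth_map = read_array("dense/stereo/depth_maps/image0001.jpg.geometric.bin")
print(depth_map.shape, depth_map.min(), depth_map.max())
```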
I used some C++ code in the end to do what I wanted to do. After linking the COLMAP library to your own C++ code, it is fairly easy to load models, including depth maps. This code could be a good entry point: https://github.com/colmap/colmap/issues/820#issue-575611194
I would like to transform the (geometric) depth maps of a COLMAP project into one common coordinate system (I do not want to do a depth map fusion). For this, I wrote the Python script below, based on the Python functions provided by COLMAP.
```python
from read_write_model import read_model  # from COLMAP's scripts/python
cameras, images, points3D = read_model("pathToModel/txt/", ".txt")
```
My main problem is that I am not clear about the interior and exterior camera orientation provided by COLMAP. As far as I understand, the quaternion and translation vector from the images.txt file give me the exterior orientation of each image, so with these I can transform from the world to the camera coordinate system and vice versa. In the code above I am "measuring" pixels in the depth map in image space. To first go from image to camera space, I move the origin to the center of the image and multiply by the pixel size. However, my final results are nonsense.
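Concretely, for a depth map pixel (u, v) my image-to-camera step looks like this sketch (`pixel_size` is exactly the quantity I cannot find, and all values here are placeholders):

```python
import numpy as np

width, height = 1920, 1080        # placeholder image dimensions
pixel_size = 1.0                  # the quantity I cannot find in the COLMAP output
depth_map = np.zeros((height, width))
u, v = 960, 540                   # an example pixel

# my current attempt: image -> camera coordinates for that pixel
x_cam = (u - width / 2.0) * pixel_size   # shift origin to the image center...
y_cam = (v - height / 2.0) * pixel_size  # ...and scale by the pixel size
z_cam = depth_map[v, u]                  # depth value as the z coordinate
```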
Is what I am doing correct? What are the units of the depth values in the depth maps? Where can I find the pixel size that COLMAP uses, to go from pixel to camera coordinates? And how do I deal with the depth maps having a different size than the original images?