colmap / pycolmap

Python bindings for COLMAP
BSD 3-Clause "New" or "Revised" License

confusion regarding image idx in fused.ply.vis #230

Closed shubham-monarch closed 5 months ago

shubham-monarch commented 5 months ago

I am reading the images.bin file inside the sparse folder as follows:

sparse_img_dict = read_images_binary(sparse_images)
sparse_keys = list(sparse_img_dict.keys())
print(sparse_keys[:2])
[29, 28]
[(29, Image(id=29, qvec=array([ 0.99998164,  0.0010699 , -0.00185604, -0.00566752]), tvec=array([-0.04931697,  1.16369875, -2.44931154]), camera_id=1, name='masked_images/34_left.jpg', xys=array([[ 377.75      ,   15.9375    ],
       [ 451.        ,   15.9375    ],
       [ 884.        ,   15.9375    ],
       [  89.0625    , 1058.5       ],
       [ 459.53585815, 1057.72924805],
       [ 361.        , 1060.5       ]]), point3D_ids=array([-1, -1, -1, ..., -1, -1, -1]))), (28, Image(id=28, qvec=array([ 9.99912806e-01,  5.89385605e-04, -1.12850933e-02, -6.83223274e-03]), tvec=array([-0.02267051,  0.85687406, -1.81195571]), camera_id=1, name='masked_images/33_right.jpg', xys=array([[  62.70589066,   14.72816944],
       [ 862.38500977,   16.50316048],
       [ 918.        ,   15.9375    ],
       [ 332.01391602, 1053.31433105],
       [  89.0625    , 1056.5       ],
       [1850.07897949, 1061.94226074]]), point3D_ids=array([-1, -1, -1, ..., -1, -1, -1])))]

As we can see, the first two keys (images) in sparse_img_dict have the IDs 29 and 28 respectively, so the keys are not in sorted order.

I then perform dense reconstruction and use the read_write_fused_vis.read_fused function to read the fused.ply and fused.ply.vis files as follows:

# dense_dir => path to the directory containing the fused.ply and fused.ply.vis files
dense_ply = dense_dir / "fused.ply"
dense_ply_vis = dense_dir / "fused.ply.vis"
dense_ply_model = read_fused(dense_ply.as_posix(), dense_ply_vis.as_posix())
pt3D = dense_ply_model[0]
MeshingPoint(position=array([ -6.607655, -19.345737,  37.4368  ], dtype=float32), color=array([76, 99, 97], dtype=uint8), normal=array([-4.8142663e-04, -8.7308598e-01, -4.8756602e-01], dtype=float32), num_visible_images=1, visible_image_idxs=array([ 7]))

Here, visible_image_idxs of pt3D contains the single value 7. Which of these two cases does that imply? (1) The target image has the ID 7, i.e. target_img = sparse_img_dict[7]; or (2) the target image is the 8th (7 + 1) element of sparse_img_dict, i.e. target_img = sparse_img_dict[sparse_keys[7]]?

sarlinpe commented 5 months ago

These are image indices, not IDs. I don't know whether they index into the sorted list of IDs or into the IDs in the order they appear on file (in case they are not sorted):

image_id = sorted(rec.images.keys())[image_idx]
# or
image_id = list(rec.images.keys())[image_idx]
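
To make the difference between the two candidates concrete, here is a minimal sketch on a toy dict. The IDs and names in it are invented for illustration, and the helper candidate_image_ids is hypothetical (it is not part of pycolmap or COLMAP). The dict stands in for the one returned by read_images_binary, with insertion order mimicking the order on file. The sketch does not settle which interpretation COLMAP actually uses; it only shows when the two disagree.

```python
# Toy stand-in for the image dict: image_id -> image name.
# All IDs and names below are made up; insertion order mimics file order.
images = {
    29: "34_left.jpg",
    28: "33_right.jpg",
    31: "35_left.jpg",
    30: "34_right.jpg",
}


def candidate_image_ids(images, image_idx):
    """Hypothetical helper: resolve an index under both interpretations."""
    by_sorted_id = sorted(images.keys())[image_idx]   # index into sorted IDs
    by_file_order = list(images.keys())[image_idx]    # index into file order
    return by_sorted_id, by_file_order


# If the IDs are already sorted on file, both interpretations always agree
# and the ambiguity does not matter.
orderings_agree = sorted(images.keys()) == list(images.keys())

by_sorted_id, by_file_order = candidate_image_ids(images, 1)
print(orderings_agree, by_sorted_id, by_file_order)
```

When the two orderings disagree, as they do for this toy dict, the resolved image has to be checked some other way, for example by confirming that the dense point actually projects inside the candidate image.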

shubham-monarch commented 5 months ago

Makes sense. Thanks.