naver / mast3r

Grounding Image Matching in 3D with MASt3R

assert (center_depth > 0).all() #15

Open skalien opened 3 months ago

skalien commented 3 months ago

Hi, I am getting this assertion error. Can you tell me what it means, and whether it is a bug in the program or something else? Thanks.

Traceback (most recent call last):
  File "/scratch/user/.local/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/scratch/user/.local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/scratch/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1897, in process_api
    result = await self.call_function(
  File "/scratch/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1483, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/scratch/user/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/scratch/user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/scratch/user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/scratch/user/.local/lib/python3.10/site-packages/gradio/utils.py", line 816, in wrapper
    response = f(*args, **kwargs)
  File "/scratch/user/mast3r/demo.py", line 205, in get_reconstructed_scene
    scene = sparse_global_alignment(
  File "/scratch/user/mast3r/mast3r/cloud_opt/sparse_ga.py", line 159, in sparse_global_alignment
    tmp_pairs, pairwise_scores, canonical_views, canonical_paths, preds_21 = prepare_canonical_data(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/scratch/user/mast3r/mast3r/cloud_opt/sparse_ga.py", line 768, in prepare_canonical_data
    canon, canon2, cconf = canonical_view(ptmaps11, confs11, subsample, **kw)
  File "/scratch/user/mast3r/mast3r/cloud_opt/sparse_ga.py", line 901, in canonical_view
    assert (center_depth > 0).all()
AssertionError

yocabon commented 3 months ago

Hi, thanks for opening this issue. This does look like a bug. I replaced the assert with clipping so that it no longer crashes. This is more of a workaround than an actual fix. The code we pushed for the sparse global alignment is not yet final. We hope to release a cleaner version eventually.
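As a rough illustration of the workaround described above, the sketch below replaces a hard assertion on positive depths with a clip to a small positive floor. It uses NumPy rather than the PyTorch tensors the repository works with, and both the helper name `safe_center_depth` and the floor value `min_depth` are assumptions made for this example, not the code that was actually pushed.

```python
import numpy as np

def safe_center_depth(center_depth, min_depth=1e-6):
    """Clip depths to a small positive floor instead of asserting.

    Hypothetical helper: the name and the default floor are illustrative.
    The original check was `assert (center_depth > 0).all()`.
    """
    center_depth = np.asarray(center_depth, dtype=np.float64)
    # Values at or below zero are raised to min_depth; others are unchanged.
    return np.clip(center_depth, min_depth, None)

depths = np.array([1.5, 0.0, -0.2, 3.0])
clipped = safe_center_depth(depths)
print(clipped)
```

Clipping keeps the pipeline running, but the clipped entries no longer carry a meaningful depth, which is consistent with calling this a workaround rather than a fix.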

skalien commented 3 months ago

The code now runs without error, but the results are not good. I tried it on a dataset on which COLMAP works, and your revised code does not give as good an output as COLMAP does. Judging by the losses, the optimization seems to be stuck in a local minimum. Is it possible to load a COLMAP poses_bounds.npy file as the initialization point for the optimization? Or could you take a look at the log and tell me whether there are better hyperparameters I should try? Thanks for your immense help.
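For reference, a poses_bounds.npy file in the LLFF convention stores one row of 17 values per image: a flattened 3x5 matrix (a 3x4 camera-to-world pose plus a column holding height, width and focal length) followed by the near and far depth bounds. The sketch below only splits such an array into its parts and rescales the focal length to the 512-pixel width the demo resizes to. Whether `sparse_global_alignment` can accept these values as an initialization is not established in this thread, the helper name `split_poses_bounds` is hypothetical, and the numbers in the example row are placeholders rather than real calibration data.

```python
import numpy as np

def split_poses_bounds(poses_bounds):
    """Split an LLFF-style (N, 17) array into poses, intrinsics and bounds."""
    arr = np.asarray(poses_bounds, dtype=np.float64)
    if arr.ndim != 2 or arr.shape[1] != 17:
        raise ValueError(f"expected shape (N, 17), got {arr.shape}")
    mats = arr[:, :15].reshape(-1, 3, 5)
    cam2world = mats[:, :, :4]   # (N, 3, 4) camera-to-world poses
    hwf = mats[:, :, 4]          # (N, 3): height, width, focal (pixels)
    bounds = arr[:, 15:]         # (N, 2): near and far depth
    return cam2world, hwf, bounds

# In practice the array would come from np.load("poses_bounds.npy").
# Here a placeholder is built: identity pose, 1536x960 image, made-up focal.
pose_3x5 = np.hstack([np.eye(3), np.zeros((3, 1)),
                      np.array([[960.0], [1536.0], [1200.0]])])
example = np.zeros((2, 17))
example[:, :15] = pose_3x5.ravel()
example[:, 15:] = [0.5, 10.0]

cam2world, hwf, bounds = split_poses_bounds(example)
# The demo resizes 1536x960 images to 512x320, so focals scale by 512 / width.
focal_at_512 = hwf[:, 2] * 512.0 / hwf[:, 1]
print(cam2world.shape, focal_at_512)
```

Note that LLFF stores the rotation axes in [down, right, backwards] order, so the poses would need an axis conversion before being compared with poses in another camera convention.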

Log:

   ~/mast3r  main  CUDA_VISIBLE_DEVICES=1 python demo.py \
                                  --weights checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth \
                                  --local_network \
                                  --server_port 6006 \ls
usage: mast3r demo [-h] [--local_network | --server_name SERVER_NAME] [--image_size {512,224}] [--server_port SERVER_PORT]
                   (--weights WEIGHTS | --model_name {MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric}) [--device DEVICE] [--tmp_dir TMP_DIR]
                   [--silent] [--share]
mast3r demo: error: unrecognized arguments: ls
   ~/mast3r  main  CUDA_VISIBLE_DEVICES=1 python demo.py \
                                  --weights checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth \
                                  --local_network \
                                  --server_port 6006 
... loading model from checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
instantiating : AsymmetricMASt3R(enc_depth=24, dec_depth=12, enc_embed_dim=1024, dec_embed_dim=768, enc_num_heads=16, dec_num_heads=12, pos_embed='RoPE100',img_size=(512, 512), head_type='catmlp+dpt', output_mode='pts3d+desc24', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), patch_embed_cls='PatchEmbedDust3R', two_confs=True, desc_conf_mode=('exp', 0, inf), landscape_only=False)
<All keys matched successfully>
Outputing stuff in /tmp/tmpc146n_52_mast3r_gradio_demo
Running on local URL:  http://0.0.0.0:6006

To create a public link, set `share=True` in `launch()`.
>> Loading a list of 20 images
 - adding /tmp/gradio/d4a5cebb09915dbb77ff519a92697e334c771599/image010.goosegro_camera1A.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/fa754af3da50d67cf4af5d0235170d0e0abfd1ca/image010.goosegro_camera1B.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/9661e529bd120d6fb303e0e91fab23c58b27fcd0/image010.goosegro_camera1C.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/dfe36d0e9f2a9e623f1a6fcda2a9d3793da2ed18/image010.goosegro_camera1D.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/fb80a28c1fc4143591d25c3fe380f5df253d3dbb/image010.goosegro_camera1E.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/e56e37f05aacad581f29b5c77cc7070219fc843e/image010.goosegro_camera1F.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/e3dc41b07aabd85a36c9cac8a1034669c59d7c6b/image010.goosegro_camera1G.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/0e92e9ce82138ef5c330c7a23d51c8730c2a68fe/image010.goosegro_camera1H.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/c75014d2951085fd734c00e222c22d325b2d8eba/image010.goosegro_camera1I.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/3ffc1f0be19d8c24391fe54b5015f80c3d492a50/image010.goosegro_camera1J.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/99bfc8e86700bae693de3732e3f879d2e289b30a/image010.goosegro_camera2A.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/7c4728d3c6056794f6cf562361c091b0019cfaa1/image010.goosegro_camera2B.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/b58d7772e8ae8bed85cf80cbbb2f98f9611385af/image010.goosegro_camera2C.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/a15b74990e7d2fd34d3347c36943e93e569e508f/image010.goosegro_camera2D.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/158450a4d92aa85ca6a7209a167130228f8f0e69/image010.goosegro_camera2E.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/ba5be2730c62d4587bcb0278b61b63ebb501b579/image010.goosegro_camera2F.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/2e125d1c23ceb22fb1b285cfa0c5b723403e3e32/image010.goosegro_camera2G.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/9050a8f0cdbf83133fb630d01db1d043f8e8369a/image010.goosegro_camera2H.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/488986c82d5009db29572ea76140384dae4a8cf1/image010.goosegro_camera2I.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 - adding /tmp/gradio/b3118dd2dc5299af395391388897df080d9685b3/image010.goosegro_camera2J.aovs.09_06.DENOISE.001.jpg with resolution 1536x960 --> 512x320
 (Found 20 images)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 380/380 [01:11<00:00,  5.29it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.76it/s]
init focals = [221.70247 221.70247 221.70247 337.40414 360.13602 328.17224 437.2105
 473.91724 347.58273 221.70247 221.70247 251.29607 221.70247 221.70247
 258.71204 221.70247 221.70247 221.70247 376.61337 221.70247]
100%|█████████████████████████████████████████████████████████████████████████████████████| 500/500 [01:03<00:00,  7.91it/s, lr=0.0000, loss=0.161]
>> final loss = 0.16088540852069855
100%|█████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:24<00:00,  8.07it/s, lr=0.0000, loss=1.538]
>> final loss = 1.5383764505386353
Final focals = [230.36617 236.69899 234.3281  260.5374  216.10912 244.1593  269.19937
 316.50726 300.4955  272.22772 241.74031 250.68259 245.31836 231.41617
 210.68994 250.09065 278.99652 283.1394  288.26767 286.3053 ]
(exporting 3D scene to /tmp/tmpc146n_52_mast3r_gradio_demo/scene.glb )

yocabon commented 3 months ago

Right now, the sparse global alignment is hit or miss; we are still working on it. You can try setting matching_conf_thr to 0: it was meant to help with images that have zero overlap, but it ends up hurting the other scenarios. If your images are too close to one another, that can also cause issues.
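To make the effect of the threshold concrete, here is a minimal sketch that assumes it acts as a cutoff on a per-pair matching score: pairs scoring below it are dropped, which would discard non-overlapping pairs but can also discard useful ones. The function `filter_pairs` and the scores are invented for illustration and are not taken from the MASt3R code.

```python
def filter_pairs(pair_scores, matching_conf_thr):
    """Keep only image pairs whose matching score reaches the threshold.

    Hypothetical helper: assumes non-negative scores keyed by (img1, img2).
    """
    return {pair: score for pair, score in pair_scores.items()
            if score >= matching_conf_thr}

# Made-up scores for three image pairs.
scores = {("A", "B"): 12.0, ("A", "C"): 3.5, ("B", "C"): 0.0}

kept_strict = filter_pairs(scores, matching_conf_thr=5.0)
kept_all = filter_pairs(scores, matching_conf_thr=0.0)
print(sorted(kept_strict), sorted(kept_all))
```

Under this assumption, a threshold of 0 keeps every pair, so weakly matched but still overlapping pairs continue to contribute to the alignment.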

ljjTYJR commented 3 months ago

Hi, I have a question about the sparse global alignment. It seems that the optimization uses the 2D-2D correspondences to minimize a coarse 3D-3D loss and a 2D-3D projection loss. The selected correspondence points are anchored to the "core depth" (which, as I understand it, is sampled uniformly from the image). But why not optimize the selected correspondence points directly? Why optimize the core depth instead?
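The anchoring described in this question can be sketched as follows, under the assumption that each correspondence pixel is tied to the nearest anchor on a regular grid and stores a fixed ratio between its own depth and the anchor's depth. The function names, the grid layout and the `subsample` value are illustrative choices for this example, not the repository's implementation.

```python
import numpy as np

def anchor_offsets(depth_map, pixels, subsample=8):
    """Tie each (x, y) pixel to a grid anchor; return anchor index and ratio."""
    H, W = depth_map.shape
    grid_h, grid_w = H // subsample, W // subsample
    gx = np.clip(pixels[:, 0] // subsample, 0, grid_w - 1)
    gy = np.clip(pixels[:, 1] // subsample, 0, grid_h - 1)
    anchor_idx = gy * grid_w + gx
    # Anchor pixel sits at the centre of its grid cell.
    ax = gx * subsample + subsample // 2
    ay = gy * subsample + subsample // 2
    ratio = depth_map[pixels[:, 1], pixels[:, 0]] / depth_map[ay, ax]
    return anchor_idx, ratio

def pixel_depths(core_depth, anchor_idx, ratio):
    """Recover per-pixel depth from the (optimizable) anchor depths."""
    return core_depth[anchor_idx] * ratio

depth_map = 1.0 + np.arange(256, dtype=np.float64).reshape(16, 16) / 100.0
pixels = np.array([[3, 5], [12, 9]])  # (x, y) correspondence locations
anchor_idx, ratio = anchor_offsets(depth_map, pixels)
core_depth = depth_map[4::8, 4::8].ravel()  # one depth per grid anchor

recovered = pixel_depths(core_depth, anchor_idx, ratio)
rescaled = pixel_depths(2.0 * core_depth, anchor_idx, ratio)
print(recovered, rescaled)
```

In this sketch only `core_depth` would be a free variable: changing an anchor's depth moves every pixel attached to it, so the number of unknowns depends on the grid size rather than on how many correspondences were selected.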