YvanYin / Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
https://jugghm.github.io/Metric3Dv2/
Creative Commons Zero v1.0 Universal

Some problems in Training #101

Open DaDaDaBiuBiu opened 1 month ago

DaDaDaBiuBiu commented 1 month ago

Hello, thanks for sharing, great work!

I want to train my own network following the tutorial in ./training/README.md, but I ran into a few problems:

  1. I could not download the pretrained checkpoints referenced in _data_server_info/pretrainedweight.py. Are there any working links on cloudstor?
  2. How do I generate the required json or pkl annotation files from the downloaded open-source data?
  3. When using sparse point cloud data, how should training be conducted? Does the data need to be projected into a dense depth map and stored as a PNG file? (One common storage convention is sketched below.)
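On point 3, one widely used convention for sparse ground truth (e.g. the KITTI depth completion format) is to project the points into the image, store depth in metres multiplied by 256 as a 16-bit PNG, and leave pixels with no return at 0 so the loss can mask them out. Whether Metric3D's loaders expect exactly this format is an assumption here; the sketch below only illustrates the projection and encoding, and all names in it are hypothetical.

```python
import cv2
import numpy as np

def sparse_points_to_depth_png(points_cam, K, h, w, out_path):
    """Project camera-frame points (N, 3) into a KITTI-style 16-bit depth PNG.

    Pixels that receive no point stay 0 and should be masked as invalid.
    """
    z = points_cam[:, 2]
    valid = z > 0                                  # keep points in front of the camera
    uvw = (K @ points_cam[valid].T).T              # perspective projection, homogeneous
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    depth = np.zeros((h, w), dtype=np.uint16)
    # KITTI convention: depth_png = depth_in_metres * 256
    depth[v[inside], u[inside]] = np.clip(z[valid][inside] * 256.0, 0, 65535).astype(np.uint16)
    cv2.imwrite(out_path, depth)
```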
Xbinzhao commented 4 weeks ago

I get the following error when loading the provided pre-trained model "metric_depth_vit_small_800k.pth":

Missing key(s) in state_dict: "cls_token", "pos_embed", "register_tokens", "mask_token", "patch_embed.proj.weight", "patch_embed.proj.bias", "blocks.0.0.norm1.weight", "blocks.0.0.norm1.bias", "blocks.0.0.attn.qkv.weight", "blocks.0.0.attn.qkv.bias", "blocks.0.0.attn.proj.weight", "blocks.0.0.attn.proj.bias", "blocks.0.0.ls1.gamma", "blocks.0.0.norm2.weight", "blocks.0.0.norm2.bias", "blocks.0.0.mlp.fc1.weight", "blocks.0.0.mlp.fc1.bias", "blocks.0.0.mlp.fc2.weight", "blocks.0.0.mlp.fc2.bias", "blocks.0.0.ls2.gamma", [... the same 14 keys repeated for blocks.0.1 through blocks.0.11 ...], "norm.weight", "norm.bias". Unexpected key(s) in state_dict: "model_state_dict".

Kevin-Miao commented 2 weeks ago

+1, commenting here for visibility

Kevin-Miao commented 2 weeks ago

Figured it out: the checkpoints above refer to the DINO backbone checkpoints, which is why the full-model weights do not match there. I commented those loading lines out and moved the torch.load and load_state_dict call into monodepth_model.py, so the checkpoint is loaded into the full model instead.
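For anyone hitting the same error, a minimal sketch of that fix, assuming the checkpoint nests its weights under "model_state_dict" as the error above indicates; the model construction is a hypothetical placeholder and should follow monodepth_model.py in the repo:

```python
import torch

# Hypothetical placeholder: build the full Metric3D network as done in
# monodepth_model.py before loading the checkpoint into it.
model = build_full_metric3d_model()

ckpt = torch.load("metric_depth_vit_small_800k.pth", map_location="cpu")
state_dict = ckpt.get("model_state_dict", ckpt)   # unwrap the nested dict if present
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```

With strict=False the load reports mismatches instead of raising, which makes it easy to check whether only auxiliary keys remain unmatched.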