lpiccinelli-eth / UniDepth

Universal Monocular Metric Depth Estimation

Load model from local files #12

Closed shuai-dian closed 3 months ago

shuai-dian commented 3 months ago

I'm trying to load a model from local files:

import json

import torch

from unidepth.models import UniDepthV1

# Build the model from the local config file.
with open("weights/config_v1_vitl14.json") as f:
    config = json.load(f)

model = UniDepthV1.build(config)

# Load the local checkpoint and copy its weights into the model.
path = "/workspace/UniDepth/weights/unidepth_v1_vitl14.bin"
mod = torch.load(path)

info = model.load_state_dict(mod, strict=False)

But the following problem occurred. Can you help me?

Traceback (most recent call last):
  File "/workspace/UniDepth/demo.py", line 16, in <module>
    info = model.load_state_dict(mod, strict=False)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UniDepthV1:
        size mismatch for pixel_encoder.mask_token: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([1, 192, 1, 1]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.0.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([192]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.0.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([192]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.0.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 192]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.1.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.1.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.1.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 384]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.2.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.2.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.2.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 768]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.3.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.3.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.input_adapter.input_adapters.3.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.0.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.0.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.0.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.1.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.1.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.1.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.2.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.2.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.2.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1536]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.3.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.3.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
        size mismatch for pixel_decoder.token_adapter.input_adapters.3.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 768]).
lpiccinelli-eth commented 3 months ago

It seems like the model you built, i.e. from the .json used as the config, has ConvNeXt Large as the backbone, but the checkpoint you downloaded corresponds to the ViT-L backbone.
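A quick way to confirm a mismatch like this before calling load_state_dict is to compare parameter shapes between the checkpoint and the freshly built model. A minimal sketch, assuming the config and checkpoint paths from the post above and that the .bin file holds a plain state dict:

import json

import torch

from unidepth.models import UniDepthV1

# Build the model from the config and load the checkpoint on CPU.
with open("weights/config_v1_vitl14.json") as f:
    config = json.load(f)

model = UniDepthV1.build(config)
checkpoint = torch.load("weights/unidepth_v1_vitl14.bin", map_location="cpu")

# Print every parameter whose shape differs between the checkpoint and the
# built model; many mismatches across the encoder and decoder usually mean
# the config and the checkpoint assume different backbones.
model_state = model.state_dict()
for name, tensor in checkpoint.items():
    if name in model_state and model_state[name].shape != tensor.shape:
        print(name, tuple(tensor.shape), "vs", tuple(model_state[name].shape))

If the printout matches the error above, the fix is to use the config file that corresponds to the downloaded checkpoint, or to download the checkpoint that matches the config.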

lpiccinelli-eth commented 3 months ago

I'm closing the issue due to inactivity. If you have any more questions, feel free to reopen it!