lpiccinelli-eth / UniDepth

Universal Monocular Metric Depth Estimation

Error while downloading V2 model #62

Open kafkaGen opened 5 months ago

kafkaGen commented 5 months ago

I tried downloading the V2 model through both the torch.hub interface and the Hugging Face-like interface; both throw the same error:

Downloading: "https://github.com/lpiccinelli-eth/UniDepth/zipball/main" to /home/olehb/.cache/torch/hub/main.zip
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/olehb/Projects/Gat/post-arrival-aircraft-walk-around-inspection-completed-accurately3/.venv/lib/python3.10/site-packages/torch/hub.py", line 566, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/olehb/Projects/Gat/post-arrival-aircraft-walk-around-inspection-completed-accurately3/.venv/lib/python3.10/site-packages/torch/hub.py", line 595, in _load_local
    model = entry(*args, **kwargs)
  File "/home/olehb/.cache/torch/hub/lpiccinelli-eth_UniDepth_main/hubconf.py", line 33, in UniDepth
    info = model.load_state_dict(torch.load(path), strict=False)
  File "/home/olehb/Projects/Gat/post-arrival-aircraft-walk-around-inspection-completed-accurately3/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UniDepthV2:
    size mismatch for pixel_decoder.depth_layer.ups.0.up.1.weight: copying a param with shape torch.Size([128, 1, 7, 7]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
    size mismatch for pixel_decoder.depth_layer.ups.0.up.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for pixel_decoder.depth_layer.ups.1.up.1.weight: copying a param with shape torch.Size([64, 1, 7, 7]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
    size mismatch for pixel_decoder.depth_layer.ups.1.up.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for pixel_decoder.depth_layer.ups.2.up.1.weight: copying a param with shape torch.Size([32, 1, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
    size mismatch for pixel_decoder.depth_layer.ups.2.up.1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]).
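The size mismatches above indicate that the downloaded checkpoint and the locally built model disagree on the layout of the `pixel_decoder.depth_layer.ups.*` layers. A minimal sketch of how such disagreements can be listed before calling `load_state_dict` is given here. The helper name `find_shape_mismatches` is hypothetical and not part of UniDepth or PyTorch, and it works on plain name-to-shape dictionaries so that it runs without a GPU or a checkpoint. With real tensors, the dictionaries could presumably be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both mappings whose shapes differ (hypothetical helper)."""
    mismatches = []
    # Only parameters that exist on both sides can have a size mismatch.
    for name in sorted(checkpoint_shapes.keys() & model_shapes.keys()):
        if checkpoint_shapes[name] != model_shapes[name]:
            mismatches.append((name, checkpoint_shapes[name], model_shapes[name]))
    return mismatches


# Shapes copied from the first two entries of the traceback above.
checkpoint_shapes = {
    "pixel_decoder.depth_layer.ups.0.up.1.weight": (128, 1, 7, 7),
    "pixel_decoder.depth_layer.ups.0.up.1.bias": (128,),
}
model_shapes = {
    "pixel_decoder.depth_layer.ups.0.up.1.weight": (256, 128, 3, 3),
    "pixel_decoder.depth_layer.ups.0.up.1.bias": (256,),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

A mismatch in both kernel size (7x7 vs 3x3) and channel layout, as seen here, usually means the model code and the checkpoint come from different revisions, rather than that the download was corrupted.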

I have no problems downloading V1.

Env: absl-py==2.1.0 aiohttp==3.9.5 aiohttp-retry==2.8.3 aiosignal==1.3.1 alphashape==1.3.1 amqp==5.2.0 antlr4-python3-runtime==4.9.3 appdirs==1.4.4 astunparse==1.6.3 async-timeout==4.0.3 asyncssh==2.14.2 atpublic==4.1.0 attrs==23.2.0 billiard==4.2.0 black==24.4.2 blosc2==2.6.2 botocore==1.34.54 CacheControl==0.14.0 cachetools==5.3.3 celery==5.4.0 certifi==2022.12.7 cffi==1.16.0 charset-normalizer==3.3.2 click==8.1.7 click-didyoumean==0.3.1 click-log==0.4.0 click-plugins==1.1.1 click-repl==0.3.0 cligj==0.7.2 colorama==0.4.6 commonmark==0.9.1 configobj==5.0.8 contourpy==1.2.1 cryptography==42.0.7 cycler==0.12.1 Cython==3.0.10 decorator==5.1.1 dictdiffer==0.9.0 diskcache==5.6.3 distro==1.9.0 docker-pycreds==0.4.0 dpath==2.1.6 dulwich==0.22.1 dvc==3.4.0 dvc-data==2.3.3 dvc-gs==2.22.1 dvc-http==2.32.0 dvc-objects==0.25.0 dvc-render==0.7.0 dvc-studio-client==0.20.0 dvc-task==0.4.0 easydict==1.13 einops==0.7.0 filelock==3.14.0 filterpy==1.4.5 fiona==1.9.6 firebase-admin==6.5.0 flake8==7.0.0 flake8-bugbear==24.2.6 flake8-comprehensions==3.14.0 flatbuffers==24.3.25 flatten-dict==0.4.2 flufl.lock==7.1.1 fonttools==4.52.4 frozenlist==1.4.1 fsspec==2024.5.0 funcy==2.0 fvcore==0.1.5.post20221221 gast==0.4.0 gcsfs==2024.5.0 geographiclib==2.0 geopandas==0.14.4 geopy==2.4.1 gitdb==4.0.11 GitPython==3.1.43 google-api-core==2.19.0 google-api-python-client==2.131.0 google-auth==2.29.0 google-auth-httplib2==0.2.0 google-auth-oauthlib==1.2.0 google-cloud-core==2.4.1 google-cloud-firestore==2.16.0 google-cloud-storage==2.16.0 google-crc32c==1.5.0 google-pasta==0.2.0 google-resumable-media==2.7.0 googleapis-common-protos==1.63.0 grandalf==0.8 grpcio==1.64.0 grpcio-status==1.48.2 h5py==3.11.0 httplib2==0.22.0 huggingface-hub==0.23.2 hydra-core==1.3.2 idna==3.7 imageio==2.34.1 imath==0.0.2 iopath==0.1.10 isort==5.13.2 iterative-telemetry==0.0.8 Jinja2==3.1.4 jmespath==1.0.1 kafka-python==2.0.2 keras==2.15.0 kiwisolver==1.4.5 kombu==5.3.7 lazy_loader==0.4 libclang==18.1.1 Markdown==3.6 
markdown-it-py==3.0.0 MarkupSafe==2.1.5 matplotlib==3.8.4 mccabe==0.7.0 mdurl==0.1.2 ml-dtypes==0.3.2 mobile-sam @ git+https://github.com/ChaoningZhang/MobileSAM.git@c12dd83cbe26dffdcc6a0f9e7be2f6fb024df0ed movingpandas==0.18.1 mpmath==1.3.0 msgpack==1.0.8 multidict==6.0.5 mypy-extensions==1.0.0 nanotime==0.5.2 ndindex==1.8 ndjson==0.3.1 networkx==3.3 ninja==1.11.1.1 norfair==0.2.0 numexpr==2.10.0 numpy==1.24.4 nvidia-cublas-cu11==11.11.3.6 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu11==11.8.87 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu11==11.8.89 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu11==11.8.89 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu11==8.7.0.84 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu11==10.9.0.58 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu11==10.3.0.86 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu11==11.4.1.48 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu11==11.7.5.86 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu11==2.19.3 nvidia-nccl-cu12==2.18.1 nvidia-nvjitlink-cu12==12.5.40 nvidia-nvtx-cu11==11.8.86 nvidia-nvtx-cu12==12.1.105 oauthlib==3.2.2 omegaconf==2.3.0 opencv-python==4.9.0.80 OpenEXR==3.2.4 opt-einsum==3.3.0 orjson==3.10.3 packaging==24.0 pandas==2.0.3 pathspec==0.12.1 pillow==10.2.0 platformdirs==3.11.0 portalocker==2.8.2 prompt_toolkit==3.0.45 proto-plus==1.23.0 protobuf==4.25.3 psutil==5.9.8 py-cpuinfo==9.0.0 pyasn1==0.6.0 pyasn1_modules==0.4.0 pycodestyle==2.11.1 pycparser==2.22 pydantic==1.9.1 pydot==2.0.0 pyflakes==3.2.0 pygit2==1.15.0 Pygments==2.18.0 pygtrie==2.5.0 PyJWT==2.8.0 pyparsing==3.1.2 pyproj==3.6.1 python-dateutil==2.9.0.post0 pytz==2024.1 PyWavelets==1.4.1 PyYAML==6.0.1 regex==2024.5.15 requests==2.32.3 requests-oauthlib==2.0.0 rich==13.7.1 rsa==4.9 Rtree==1.2.0 ruamel.yaml==0.18.6 ruamel.yaml.clib==0.2.8 safetensors==0.4.3 scikit-image==0.21.0 scipy==1.8.1 scmrepo==1.4.1 seaborn==0.11.1 sentry-sdk==2.3.1 setproctitle==1.3.3 shapely==2.0.4 shortuuid==1.0.13 shtab==1.7.1 
simplification==0.7.10 six==1.16.0 smmap==5.0.1 sqltrie==0.11.0 sympy==1.12.1 tables==3.9.2 tabulate==0.9.0 tensorboard==2.15.2 tensorboard-data-server==0.7.2 tensorboard-plugin-wit==1.8.1 tensorflow-cpu==2.15.1 tensorflow-estimator==2.15.0 tensorflow-io-gcs-filesystem==0.37.0 termcolor==2.4.0 thop==0.1.1.post2209072238 tifffile==2024.5.22 timm==0.9.7 tokenizers==0.19.1 tomli==2.0.1 tomlkit==0.12.5 torch==2.2.0+cu118 torchaudio==2.2.0+cu118 torchvision==0.17.0+cu118 tqdm==4.66.4 transformers==4.41.1 trimesh==4.4.0 triton==2.2.0 typing_extensions==4.12.0 tzdata==2024.1 ultralytics==8.2.26 -e git+https://github.com/lpiccinelli-eth/UniDepth.git@ab0fa3b9854b6f44573aa002bc281e9b91488aee#egg=unidepth&subdirectory=../../UniDepth uritemplate==4.1.1 urllib3==1.26.13 vine==5.1.0 voluptuous==0.14.2 wandb==0.17.0 wcwidth==0.2.13 Werkzeug==3.0.3 wrapt==1.14.1 xformers==0.0.24+cu118 yacs==0.1.8 yarl==1.9.4 zc.lockfile==3.0.post1

tkhurana-bdai commented 5 months ago

I'm facing the same error. Note that this issue did not exist 2-3 weeks back.

lpiccinelli-eth commented 5 months ago

I cannot reproduce the error: I pulled the repo (and installed it via pip) and tried the following:

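import torch
from unidepth.models import UniDepthV2  # required for the from_pretrained call below

# Option 1: load through torch.hub, forcing a fresh download of the repo
model = torch.hub.load("lpiccinelli-eth/UniDepth", "UniDepth", version="v2", backbone="vitl14", pretrained=True, trust_repo=True, force_reload=True)
# Option 2: load through the Hugging Face-like interface
model = UniDepthV2.from_pretrained("lpiccinelli/unidepth-v2-vitl14")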

Both commands run fine. My only doubt is whether you are actually using an up-to-date installation, i.e. the package pip-installed from the current main branch of the repo.
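One way to check which copy of the package Python actually imports is to ask the import system where it resolves the module. The following is a minimal sketch using only the standard library. The helper name `locate_package` is hypothetical, and `unidepth` is assumed to be the import name of the installed package. For an editable install such as the one in the environment listing above, the reported path should point into the local checkout, which can then be compared against the current main branch.

```python
import importlib.util
from typing import Optional


def locate_package(name: str) -> Optional[str]:
    """Return the file path from which a top-level package would be
    imported, or None if it cannot be found (hypothetical helper)."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For regular packages this is the path of the package's __init__.py.
    return spec.origin


# Prints None if the package is not installed in the current environment.
print(locate_package("unidepth"))
```

If the printed path points to an old checkout, updating or re-cloning that checkout and reinstalling it is the natural next step.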

kafkaGen commented 5 months ago

Yep, your suggestion was close. I cleared and re-cloned the repo, and everything works fine now. I'm not sure what changed, perhaps something in the model configuration. Could this be due to an update to the V2 architecture?