Hey! Thanks for your hard work creating captionr. Unfortunately, I'm seeing the error below on both native Windows and WSL. Any help is appreciated!
Python version: 3.8.10.
Command:

```shell
python captionr.py /mnt/d/model_training/deltron/images/500px/people \
  --blip2_question_file /mnt/d/model_training/deltron/captions/blip2/question.txt \
  --prepend_text "a photo of " \
  --existing=skip \
  --cap_length=75 \
  --blip_pass \
  --use_blip2 \
  --blip2_model blip2_opt/pretrain_opt6.7b \
  --clip_model_name=ViT-L-14/openai \
  --uniquify_tags \
  --device=cuda \
  --extension=txt
```
Exception:

```
ERROR:root:Exception during BLIP captioning
Traceback (most recent call last):
  File "/mnt/d/model_training/code/captionr/captionr/captionr_class.py", line 139, in process_img
    new_caption = config._blip.caption(img)
  File "/mnt/d/model_training/code/captionr/captionr/blip2_cap.py", line 22, in caption
    return self.model.generate({"image": image})[0]
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/blip2_opt.py", line 213, in generate
    outputs = self.opt_model.generate(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 1490, in generate
    return self.beam_search(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 2749, in beam_search
    outputs = self(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/modeling_opt.py", line 1037, in forward
    outputs = self.model.decoder(
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/model_training/code/captionr/venv/lib/python3.8/site-packages/lavis/models/blip2_models/modeling_opt.py", line 703, in forward
    inputs_embeds = torch.cat([query_embeds, inputs_embeds], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
```
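For what it's worth, the failure itself is a plain batch-dimension mismatch in `torch.cat`: the two tensors being concatenated along the sequence dimension (dim 1) have different sizes in dim 0 (25 vs. 5), which looks like one of them was expanded for beam search and the other was not. A minimal standalone sketch (the shapes below are made up to match the error message; they are not taken from captionr or LAVIS):

```python
import torch

# Hypothetical shapes chosen to reproduce the same RuntimeError:
# one tensor's batch dim looks beam-expanded (5 * 5 = 25), the other's does not.
query_embeds = torch.zeros(25, 32, 16)   # (batch * num_beams, query_len, hidden)
inputs_embeds = torch.zeros(5, 8, 16)    # (batch, seq_len, hidden)

try:
    torch.cat([query_embeds, inputs_embeds], dim=1)
except RuntimeError as e:
    # Same class of error as in the traceback: sizes must match in every
    # dimension except the one being concatenated along.
    print(e)
```

So the question is presumably why, at `modeling_opt.py` line 703, one of the two embeddings has been repeated per beam while the other has not (e.g. a transformers/LAVIS version mismatch), rather than anything wrong with the concatenation itself.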
Pip packages:
```
Package Version
altair 4.2.2 antlr4-python3-runtime 4.9.3 asttokens 2.2.1 attrs 22.2.0 backcall 0.2.0 backports.zoneinfo 0.2.1 blinker 1.5 blip-vit 0.0.3 blis 0.7.9 braceexpand 0.1.7 cachetools 5.3.0 catalogue 2.0.8 certifi 2022.12.7 cfgv 3.3.1 charset-normalizer 3.1.0 click 8.1.3 cmake 3.26.0 confection 0.0.4 contexttimer 0.3.3 contourpy 1.0.7 cycler 0.11.0 cymem 2.0.7 decorator 5.1.1 decord 0.6.0 distlib 0.3.6 einops 0.6.0 entrypoints 0.4 executing 1.2.0 fairscale 0.4.4 filelock 3.10.0 fonttools 4.39.2 ftfy 6.1.1 gitdb 4.0.10 GitPython 3.1.31 huggingface-hub 0.13.2 identify 2.5.21 idna 3.4 imageio 2.26.0 importlib-metadata 6.0.0 importlib-resources 5.12.0 iopath 0.1.10 ipython 8.11.0 jedi 0.18.2 Jinja2 3.1.2 jsonschema 4.17.3 kaggle 1.5.13 kiwisolver 1.4.4 langcodes 3.3.0 lazy-loader 0.1 Levenshtein 0.20.9 lit 15.0.7 markdown-it-py 2.2.0 MarkupSafe 2.1.2 matplotlib 3.7.1 matplotlib-inline 0.1.6 mdurl 0.1.2 mpmath 1.3.0 murmurhash 1.0.9 networkx 3.0 nodeenv 1.7.0 numpy 1.24.2 omegaconf 2.3.0 open-clip-torch 2.16.0 opencv-python-headless 4.5.5.64 opendatasets 0.1.22 packaging 23.0 pandas 1.5.3 parso 0.8.3 pathy 0.10.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.4.0 pip 21.1.1 pkgutil-resolve-name 1.3.10 platformdirs 3.1.1 plotly 5.13.1 portalocker 2.7.0 pre-commit 3.2.0 preshed 3.0.8 prompt-toolkit 3.0.38 protobuf 3.20.3 ptyprocess 0.7.0 pure-eval 0.2.2 pyarrow 11.0.0 pycocoevalcap 1.2 pycocotools 2.0.6 pydantic 1.10.6 pydeck 0.8.0 Pygments 2.14.0 Pympler 1.0.1 pyparsing 3.0.9 pyrsistent 0.19.3 python-dateutil 2.8.2 python-Levenshtein 0.20.9 python-magic 0.4.27 python-slugify 8.0.1 pytz 2022.7.1 pytz-deprecation-shim 0.1.0.post0 PyWavelets 1.4.1 PyYAML 6.0 rapidfuzz 2.13.7 regex 2022.10.31 requests 2.28.2 rich 13.3.2 salesforce-lavis 1.0.0 scikit-image 0.20.0 scipy 1.9.1 semver 2.13.0 sentencepiece 0.1.97 setuptools 56.0.0 six 1.16.0 smart-open 6.3.0 smmap 5.0.0 spacy 3.5.1 spacy-legacy 3.0.12 spacy-loggers 1.0.4 srsly 2.4.6 stack-data 0.6.2 streamlit 1.20.0 sympy 1.11.1 tenacity 8.2.2 text-unidecode 1.3 thefuzz 0.19.0 thinc 8.1.9 tifffile 2023.3.15 timm 0.4.12 tokenizers 0.13.2 toml 0.10.2 toolz 0.12.0 torch 2.0.0+cu117 torchvision 0.15.1+cu117 tornado 6.2 tqdm 4.65.0 traitlets 5.9.0 transformers 4.28.0.dev0 triton 2.0.0 typer 0.7.0 typing-extensions 4.5.0 tzdata 2022.7 tzlocal 4.3 urllib3 1.26.15 validators 0.20.0 virtualenv 20.21.0 wasabi 1.1.1 watchdog 2.3.1 wcwidth 0.2.6 webdataset 0.2.43 wheel 0.40.0 zipp 3.15.0
```