loboere opened 1 year ago
I have found the solution, at least for myself: it might be a version mismatch in some of the modules. Check the requirements text file; if any of those modules are newer on your system than the versions pinned in that file, a deprecated feature may be preventing this from running.
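That check can be sketched in a few lines. This is a minimal illustration, assuming the pins below match the repo's requirements file (the package versions here are taken from the fix command in this thread, and the helper names are hypothetical; it also assumes purely numeric version strings):

```python
# Hypothetical pins, mirroring the fix command posted in this thread.
PINNED = {"timm": "0.4.12", "transformers": "4.17.0", "fairscale": "0.4.4"}

def parse(version):
    # Compare versions numerically, not lexically ("4.9" < "4.17" as versions).
    return tuple(int(part) for part in version.split("."))

def newer_than_pinned(installed):
    # installed: {package: version string actually found on the system}
    return sorted(pkg for pkg, want in PINNED.items()
                  if pkg in installed and parse(installed[pkg]) > parse(want))

print(newer_than_pinned({"timm": "0.9.2", "transformers": "4.17.0"}))  # ['timm']
```

Any package this flags is a candidate for the downgrade described below.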
I had this exact same error and fixed it with this command:
pip install timm==0.4.12 transformers==4.17.0 fairscale==0.4.4 pycocoevalcap pillow
pip found that my timm, transformers, and fairscale were on newer versions, pulled in the downgrades, and got this working on the first try.
If you already use these packages for anything else, downgrading might break that functionality, so it may not be worth it unless you really need this system.
EDIT: This error also crops up if you use a batch size larger than the number of image files being processed.
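A hedged sketch of a guard for that case: clamp the requested batch size to the number of images before splitting. `make_batches` is a hypothetical helper for illustration, not a function from this repo:

```python
def make_batches(paths, batch_size):
    # Clamp so the batch size never exceeds the dataset size (and is >= 1);
    # a batch size larger than the image count is what triggers the error.
    batch_size = max(1, min(batch_size, len(paths)))
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]

images = [f"img_{i:04d}.png" for i in range(263)]   # 263 images, as in the log below
batches = make_batches(images, 64)     # 4 full batches of 64 plus one of 7
oversized = make_batches(images, 500)  # clamped: a single batch of 263
```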
Yeah, please do the pip install in an empty virtual env.
my images are 256x256 pixels
/content/image-captioning
2023-09-02 18:30:18.889829: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Device: cuda:0
Images found: 263
Split size: 263
Checkpoint loading...
load checkpoint from ./checkpoints/model_large_caption.pth
Model to cuda:0
Inference started
0batch [00:01, ?batch/s]
Traceback (most recent call last):
  File "/content/image-captioning/inference.py", line 88, in <module>
    caption = model.generate(
  File "/content/image-captioning/models/blip.py", line 201, in generate
    outputs = self.text_decoder.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1675, in generate
    return self.beam_search(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 3014, in beam_search
    outputs = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 886, in forward
    outputs = self.bert(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 781, in forward
    encoder_outputs = self.encoder(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 445, in forward
    layer_outputs = layer_module(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 361, in forward
    cross_attention_outputs = self.crossattention(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 277, in forward
    self_outputs = self.self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/image-captioning/models/med.py", line 178, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0
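For context on the RuntimeError itself: torch.matmul treats every dimension except the last two as batch dimensions, and right-aligned batch dimensions must be equal or 1 to broadcast; here they are 3 and 9, so the call fails. One plausible cause in cross-attention is the text states being beam-expanded while the encoder states are not, but that is an assumption. A minimal pure-Python sketch of the broadcasting rule (the real check lives inside PyTorch; the shapes below are illustrative):

```python
def batch_dims_compatible(shape_a, shape_b):
    # torch.matmul batch-dim rule: drop the last two (matrix) dims,
    # then compare the remaining dims right-aligned; each pair must
    # match exactly or contain a 1.
    a, b = shape_a[:-2], shape_b[:-2]
    for da, db in zip(reversed(a), reversed(b)):
        if da != db and 1 not in (da, db):
            return False
    return True

# Query batch 3 vs key batch 9: incompatible, as in the traceback above.
print(batch_dims_compatible((3, 12, 35, 64), (9, 12, 64, 35)))  # False
print(batch_dims_compatible((9, 12, 35, 64), (9, 12, 64, 35)))  # True
```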