Yushi-Hu / PromptCap

Natural language guided image captioning

PromptCap is not working on Google Colab #11

Open mkoursha opened 3 months ago

mkoursha commented 3 months ago

Hey there! Thanks a lot for the amazing work and for making it public. Unfortunately, when I tried to run the code on Colab, I got the following error:

```
TypeError                                 Traceback (most recent call last)
in <cell line: 7>()
      5 image = "/content/temp1.jpg"
      6 
----> 7 print(model.caption(prompt, image))

2 frames

/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, **kwargs)
   1573         if self.config.is_encoder_decoder and "encoder_outputs" not in model_kwargs:
   1574             # if model is encoder decoder encoder_outputs are created and added to model_kwargs
-> 1575             model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(
   1576                 inputs_tensor, model_kwargs, model_input_name, generation_config
   1577             )

TypeError: OFAModel._prepare_encoder_decoder_kwargs_for_generation() takes from 3 to 4 positional arguments but 5 were given
```

The code that I tried to run is as follows:

In one cell, I run: !pip install promptcap

In another cell, I run:

```python
import torch
from promptcap import PromptCap

model = PromptCap("tifa-benchmark/promptcap-coco-vqa")  # also supports OFA checkpoints, e.g. "OFA-Sys/ofa-large"
if torch.cuda.is_available():
    model.cuda()

prompt = "what does the image describe?"
image = "/content/temp1.jpg"

print(model.caption(prompt, image))
```
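
From the traceback, my guess is that the newer transformers generate() passes generation_config as an extra positional argument to _prepare_encoder_decoder_kwargs_for_generation, which the OFAModel override doesn't accept. I sketched a monkey-patch along these lines, but I couldn't verify it, so treat it as a guess; in particular, I'm assuming the wrapped Hugging Face model is reachable as model.model, which may not match promptcap's actual attribute name:

```python
# Untested compatibility shim (sketch): swallow the extra generation_config
# argument that newer transformers versions pass positionally, then delegate
# to the original OFA override, which only accepts the first three.
ofa_cls = type(model.model)  # assumption: PromptCap exposes the HF model as .model
_orig = ofa_cls._prepare_encoder_decoder_kwargs_for_generation

def _patched(self, inputs_tensor, model_kwargs, model_input_name=None, *extra):
    # *extra absorbs generation_config (and anything later versions add)
    return _orig(self, inputs_tensor, model_kwargs, model_input_name)

ofa_cls._prepare_encoder_decoder_kwargs_for_generation = _patched
```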

Any help will be appreciated.

MohamedAfham commented 2 months ago

I'm having the same issue. Following the thread.

mohammadmirzaee25 commented 2 months ago

It's still not working. Please fix it.

yaaisiu commented 4 weeks ago

I have the same issue, and the only thing I know (or think I know) is that it's related to this:

```
tifa-benchmark/promptcap-coco-vqa
<super: <class 'OFATokenizer'>, <OFATokenizer object>>
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'GPTNeoXTokenizerFast'.
The class this function is called from is 'OFATokenizer'.
```

This shows up when the model is loaded. I tried some solutions to change the tokenizer, but failed miserably :)
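
For reference, the mismatch is easy to reproduce without promptcap at all; a quick probe with plain AutoTokenizer shows which class the checkpoint's tokenizer config resolves to:

```python
# Probe which tokenizer class the checkpoint's config resolves to.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("tifa-benchmark/promptcap-coco-vqa")
print(type(tok).__name__)  # per the warning above, this should name GPTNeoXTokenizerFast
```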

civugd commented 2 weeks ago

I encountered the same issue. After downgrading transformers, it worked successfully. Here are the version details after downgrading:

```
python=3.10.14
transformers=4.29.2
numpy=1.26.4
pytorch=1.12.1
torchvision=0.13.1
pillow=9.5.0
tokenizers=0.13.3
```
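
On Colab, a single cell along these lines should pin everything (a sketch; note that the pip package for pytorch is torch, and the exact torch/torchvision wheels available may vary by runtime):

```
!pip install promptcap transformers==4.29.2 tokenizers==0.13.3 numpy==1.26.4 pillow==9.5.0 torch==1.12.1 torchvision==0.13.1
```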
mkoursha commented 2 weeks ago

> I encountered the same issue. After downgrading transformers, it worked successfully. Here are the version details after downgrading:
>
> python=3.10.14
> transformers=4.29.2
> numpy=1.26.4
> pytorch=1.12.1
> torchvision=0.13.1
> pillow=9.5.0
> tokenizers=0.13.3

Hey @civugd, thanks a lot! It worked for me after downgrading the transformers lib.