ExponentialML / Video-BLIP2-Preprocessor

A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it
MIT License
131 stars, 17 forks

Unable to generate Captions for Videos #10

Open SakshiKhadilkar opened 8 months ago

SakshiKhadilkar commented 8 months ago

1) This model is being used in ExponentialML Text-To-Video Fine Tuning.

2) I ran the script preprocess.py, which uses "Salesforce/blip2-opt-2.7b".

3) There is an issue with the following function in the modeling_blip_2.py file:

@torch.no_grad()
def generate(
    self,
    pixel_values: torch.FloatTensor,
    input_ids: Optional[torch.LongTensor] = None,
    attention_mask: Optional[torch.LongTensor] = None,
    **generate_kwargs,
) -> torch.LongTensor:

The following are the errors:

[Screenshot 2024-03-01 110532]
[Screenshot 2024-03-01 110942]
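For reference, here is a minimal sketch of how the `generate` function quoted above is usually reached when captioning a single frame with the Hugging Face `transformers` BLIP-2 classes. It is not the code from preprocess.py; the helper names `load_blip2` and `caption_frame` and the `max_new_tokens=20` default are assumptions made for illustration. Running something like this outside the preprocessor may help narrow down whether the error comes from the model call itself.

```python
def load_blip2(model_name="Salesforce/blip2-opt-2.7b"):
    """Load the BLIP-2 processor and model (hypothetical helper).

    transformers is imported lazily so that caption_frame below can be
    exercised without the library or the model weights being present.
    """
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    processor = Blip2Processor.from_pretrained(model_name)
    model = Blip2ForConditionalGeneration.from_pretrained(model_name)
    return processor, model


def caption_frame(image, processor, model, max_new_tokens=20):
    """Return a caption for one frame (hypothetical helper).

    `image` is expected to be a PIL image or an array the processor accepts.
    """
    # The processor turns the image into the `pixel_values` tensor that
    # generate() takes as its first argument.
    inputs = processor(images=image, return_tensors="pt")

    # generate() is already wrapped in @torch.no_grad(), so no extra
    # gradient context is needed here.
    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the token ids back to text and trim surrounding whitespace.
    captions = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return captions[0].strip()
```

With the real model this would be called as `caption_frame(frame, *load_blip2())`, where `frame` is a single video frame converted to a PIL image.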

AiSaurabhPatil commented 8 months ago

I am also facing the same issue! [image]