NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

how to generate multiple captions for an image using BLIP2 #350

Closed · laxmimerit closed this issue 9 months ago

laxmimerit commented 9 months ago

Hi, I was following this notebook: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BLIP-2/Chat_with_BLIP_2.ipynb

How can I generate multiple captions for a single image?

Here is a sample code snippet from the demo:

inputs = processor(image, return_tensors="pt").to(device, torch.float16)

# greedy decoding is deterministic, so this always returns the same single caption
generated_ids = model.generate(**inputs, max_new_tokens=20)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(generated_text)
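
For context, the snippet above assumes that processor, model, device, and image are already set up as in the notebook. A minimal sketch of that setup (the Salesforce/blip2-opt-2.7b checkpoint and the example.jpg path are placeholders, swap in whatever you are actually using):

import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"

# placeholder checkpoint; use the one from the notebook you are following
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to(device)

# placeholder image path
image = Image.open("example.jpg").convert("RGB")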
laxmimerit commented 9 months ago

Used the sampling method as a solution:

inputs = processor(images=image, return_tensors="pt").to(device, torch.float16)

# nucleus sampling (top_p) makes decoding stochastic, so repeated calls
# produce different captions for the same image
generated_ids = model.generate(**inputs, do_sample=True, top_p=0.95)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(generated_text)
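
Since sampling is stochastic, each call to generate can produce a different caption. To get several captions from a single call, the standard num_return_sequences argument of generate can be combined with sampling; a minimal sketch, assuming processor, model, image, and device are set up as in the earlier snippet:

inputs = processor(images=image, return_tensors="pt").to(device, torch.float16)

# draw 3 independent samples in one call; batch_decode then returns 3 strings
generated_ids = model.generate(
    **inputs,
    do_sample=True,
    top_p=0.95,
    max_new_tokens=20,
    num_return_sequences=3,
)
captions = [c.strip() for c in processor.batch_decode(generated_ids, skip_special_tokens=True)]
for caption in captions:
    print(caption)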