haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Question] How to generate caption for images in dataset on hpc #351

Open ohhiohhi opened 1 year ago

ohhiohhi commented 1 year ago

Question

I want to know how to generate captions for all the images in a dataset using LLaVA on an HPC cluster. Can anyone tell me how?

haotian-liu commented 1 year ago

Hi, you can refer to model_vqa.py for running inference on all images in a dataset.

Reference instructions (the first point suffices): https://github.com/haotian-liu/LLaVA#gpt-assisted-evaluation
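For reference, model_vqa.py reads the question file as JSONL, one object per line; as far as I can tell it looks up the fields `question_id`, `image`, and `text`. A minimal sketch for building such a file over an image folder (the folder, output path, and prompt below are placeholders):

```python
import json
import os

# Sketch: build a question file for llava.eval.model_vqa.
# Field names (question_id, image, text) follow what model_vqa.py reads;
# image_folder, the output path, and the prompt are placeholders.
image_folder = "./llava/serve/examples"
prompt = "caption the image in 30 words or less"

with open("./playground/test/test.jsonl", "w") as f:
    for i, name in enumerate(sorted(os.listdir(image_folder))):
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            f.write(json.dumps({"question_id": i, "image": name, "text": prompt}) + "\n")
```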

ohhiohhi commented 1 year ago

Thanks, I tried what you said, but a new problem arose. @haotian-liu

When I use model_vqa.py for captioning, the generated text is garbled. I didn't change the model_vqa.py file.

Pretrained model: llava-llama-2-13b-chat-lightning-preview

I ran the following command:

```
python -m llava.eval.model_vqa \
    --model-path "./llava-llama-2-13b-chat-lightning-preview" \
    --question-file ./playground/test/test.jsonl \
    --image-folder ./llava/serve/examples/ \
    --answers-file ./playground/test/test_answer.jsonl
```

Prompt: caption the image in 30 words or less

It generates garbled results for all images, such as: `c. sec: router devgn #`

At first I thought it was a model issue, so I tried the pretrained model LLaVA-Lightning-MPT-7B-preview, but the output was still a mess.

Then I thought it was an environment issue, but when I use llama-7b-hf for QA tasks, the output is normal.

Now I don't know what the cause is.

fj6833 commented 12 months ago

Hello, may I ask if you have resolved this issue? I have the same question. Could you please share your contact information? Thank you.

ohhiohhi commented 11 months ago

Sorry, I didn't fix it. We can get in touch via my email: 18191547735@163.com

jameszhou-gl commented 10 months ago

Hi @ohhiohhi, just a discussion point. I once tried using LLaVA to generate captions on the ScienceQA dataset. It worked well; at least I found no garbled results. Have you tried the ScienceQA dataset with llava.eval.model_vqa? I first converted the ScienceQA format into VQA format.
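For context, here is a rough sketch of the kind of conversion I mean, assuming the public ScienceQA release's problems.json layout (id → {question, image, split, ...}) with images stored under `<split>/<id>/`; the paths are placeholders:

```python
import json

# Sketch: convert ScienceQA entries into the JSONL format consumed by
# llava.eval.model_vqa (question_id / image / text). The problems.json
# layout and paths are assumptions based on the public ScienceQA release.
problems = json.load(open("data/scienceqa/problems.json"))

with open("playground/test/sqa_test.jsonl", "w") as f:
    for pid, prob in problems.items():
        if prob.get("split") != "test" or not prob.get("image"):
            continue  # keep only test questions that come with an image
        f.write(json.dumps({
            "question_id": pid,
            "image": f"{pid}/{prob['image']}",  # e.g. 123/image.png under the split folder
            "text": prob["question"],
        }) + "\n")
```

Then point `--image-folder` at the split's image directory (e.g. `data/scienceqa/test` under these assumptions).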

ohhiohhi commented 10 months ago

Thanks for your reply. I solved the problem by reconfiguring the environment.
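In case anyone else hits the same garbled output: before rebuilding everything, a quick check is to compare the installed package versions against the pins in the repo's pyproject.toml, since a transformers/tokenizers version drift is a common cause of gibberish generations. A small sketch:

```python
# Sketch: print versions of the packages most likely to cause garbled
# generations when they drift from the repo's pinned versions.
import importlib.metadata as md

for pkg in ("torch", "transformers", "tokenizers", "accelerate", "sentencepiece"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "NOT INSTALLED")
```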

FurkanGozukara commented 10 months ago

> Thanks for your reply. I solved the problem by reconfiguring the environment.

Hi, what prompt are you giving?

@jameszhou-gl which prompt did you use to caption images?

ohhiohhi commented 10 months ago

> Hi, what prompt are you giving?
>
> @jameszhou-gl which prompt did you use to caption images?

'caption the image with keywords'